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PREFACE 

A  great  deal  oi  progress  has  been  made  in  the  development  and  ap¬ 
plication  of  the  theory  cf  prediction  and  estimation  a..  noise  for  the 
case  of  linear  problems.  A  major  research  frontier  today  concerns  the 
problems  of  estimation  using  noise -corrupted  observations  in  nonlinear 
systems.  In  this  Memorandum  a  theoretical  approach  to  maximum-likelihood 
ediction  and  estimation  is  developed  for  certain  such  nonlinear  situ¬ 
ations  and  is  applied  to  the  special  case  of  numerically  estimating  the 
initial  conditions  of  a  radar-observed  reentry  body. 

The  work  reported  here  is  part  of  Rand's  continuing  basic  research 
in  engineering  and  systems  science. 


-v- 


! 

j 

|  SUMMARY 

i 

f 

It  is  often  desirable  to  estimate  certain  parameters,  such  as  in¬ 
itial  conditions,  of  a  dynamic  physical  process,  using  nc  •'  se  -cor  rupted 
:  observations.  This  Memorandum  describes  a  maximum-likelihood  solution 

i  when  the  process  is  time -invariant  nonlinear  m-vector  and  the  observa¬ 

tion  is  a  linear  combination  of  true  process  coordinates  and  noise.  It 
is  applied  to  a  white-gaussian-noise  case  in  which  the  initial  conditions 
have  a  known  a  priori  joint  gaussian  distribution  and  all  other  process 
parameters  are  known.  It  is  desired  to  estimate  the  state  at  t  ^  0, 
given  continuous  observation  over  (0,T)  and  a  priori  statistics. 

Given  a  maximum-1 ikelihood  estimate  of  initial  conditions,  a  nat¬ 
ural  estimate  for  the  system  state  at  t  >  0  simply  updates  the  process 
differential  equation  from  the  estimated  initial  conditions.  The  al¬ 
gorithm  therefore  estimates  initial  conditions.  It  can  be  shown  that 
the  maximum- 1  ike i ihood  estimate  is  th  initial  condition  which  mini¬ 
mizes  a  certain  functional  on  that  initial  condition,  the  observation, 
and  the  a  priori  statistics.  This  functional  describes  an  (m  4-  1)  sur¬ 
face  at  time  T,  and  the  desired  estimate  corresponds  to  its  minimum;  a 
differential  equation  is  developed  which  governs  the  evolution  of  this 
estimate  with  time  T. 

Using  this  differential  equation,  the  algorithm  calculates  as  a 
function  of  T  that  maximum-likelihood  solution  which  evolves  from  the 
unique  solution  at  time  T  ■  0  (given  by  the  a  priori  mean  vector). 

Using  this  differential -equation  algorithm  to  stay  near  the  desired 
solution  when  updating  T,  Newtonian  techniques  may  be  ueed  to  Improve 
the  solution  for  constant  T. 
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Earlier  computational  results  suggest  that  maximum  likelihood  in 
preferred  to  various  approximations.  This  Memorandum  explores  computing 
feasibility  for  difficult  vector  cases,  rather  than  extending  these  nu¬ 
merical  comparisons  for  simpler  scalar  situations.  Both  pure  differen¬ 
tial  equation  and  di fferentiai -equation/Newtonian  solutions  were  used 
to  estimate  reentry-vehicle  initial  conditions.  A  priori  statistics 
presumably  result  from  prior  less  accurate  tracking;  the  observation 
from  a  more  accurate  tracking  system.  One  coordinate  time  rate  is  not 
measured,  and  its  initial  condition  serves  as  an  unknown  parameter  with 
a  priori  statistics. 

A  fixed  reentry-path  observation  was  generated  using  Monte  Carlo 
noise,  and  initial  conditions  were  estimated.  Convergence  depended  upon 
the  size  of  integration  constants  used.  The  estimate  converged  to  within 
•-he  order  of  the  Cramer-Rao  conditional  bound.  Accuracy  was  exchanged 
between  coordinates  to  improve  the  overall  estimate.  The  initial  angle 
rate,  which  could  be  viewed  as  an  unknown  parameter,  was  handled  with 
the  same  or  better  rapidity  and  accuracy  of  convergence  than  the  other 
Initial  conditions.  Readily  available  programs  and  the  use  of  a  general- 
purpose  computer  resulted  In  processing  times  much  larger  than  real-time 
observation.  However,  efficient  programs  and  special  computers  could 
reduce  this  to  on-line  realization. 
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SYMBOLS 


A^  «  reentry-vehicle  frontal  (drag)  area 
A(X)  ,G(X)  ,J (X)  **  functionals  on  H(a)  ,  X(a;X),  R,  and  Z(«)  ,  0  s  <  T 
3y  =  unit  vector  along  the  reentry-vehicle  line  of  flight 
B(Xq,T)  «  matrix  function  of  Xq  and  T 

0^  reentry-vehicle  drag  coefficient 
D  =  drag-force  vector  acting  on  reentry  vehicle 
Detf*"]  •  determinant 
E(  • )  »  expectation 

E ( *  j Y)  ■  conditional  exoectation,  given  Y 

F  *  total-force  vector  acting  on  reentry  vehicle 


-  FfX  ,cr^  -  FTX^  -  - 


MX;,;ri/ 


f(X,T)  •>  functional  on  u,  A,  aM  Z(a)  ,  0  <  s  •;  T 


G  -  factor  in  error  atatiatlc 


C  *  gravity-force  vector  acting  on  reentry  vehicle 

g  *  gravitational  acceleration 

k.  A  *■ 

g  (a,X)  -  function  related  to  r^(a,u),  H^(u) ,  and  X(u;X) 

H^(a)  -  itl1  row  of  H(a) 


Hj  (a) 


W 


-  Beatrix  which  define*  the  effect  of  each  coordinate  on 
the  observation 


I  -  identity  matrix 
X  -  normalising  constant 


L(a)  “  vector  Wiener  ^rocaaa 
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M(i;X) 

m 

k  A 
m*(X) 

N 

N{s) 

n 

P(-) 

R(i) 

R(?,s) 

R(',s) 

r 

r . 

.1 

T 

t,s,u 

V 

W 

* 

X 

X(0) 

X(b;X) 

X(T) 

X(t) 


A 

conditional  expectation  of  Z(i),  given  X 
reentry-vehicle  mass 

A 

expectation  of  random  variable  related  to  H  (e)  ,  X(s;X) 

T/AT  =  number  of  updatings  in  T 

noise  in  observation  at  s ,  0  s  s  £  T 

number  of  Newtonian  approximations 

probability 

covariance  matrix  for  Z{ i) 


covariance  of  the  noise 


-  cc  3rian-e  of  white  noise 


radial  distance  to  reentry  vehicle 
unit  radial  vector 

upper  l..dex  limit  of  observation  interval 
process  index,  0<s,u<T,  0<t 
reer  ry-venicle  viocity 
weight  of  reentry  vehicle 
transpose  of  matrix  X 

Xq  =  true  initial  condition  of  X(t)  at  t  ~  0 
X(s),  given  that  X(0)  *  X 


X(0;T)  =  X„ 


'X,(0^ 

j,  physical  process 

'  (c)  / 

m  / 


t 

> 


/x.(r;T)\ 

X(t;T)  -I  i=  maximum-likelihood  estimate  of  true  X(t)  , 

(t;T)  /  1  >  °>  S^en  Z(s)  -  0  5  s  <  T 


Z(i)  =  I  I-  random  vector 

W/ 


«’  |  =  noi.!  ■?-ccr?ispted  observation  at  s,  0  i  s  <  I 


=  unit  vector  alorvs  z-ccoydinate 


z'.  =  random  variable  related  to  *  (s)  ,  0  •  s  v  T 
a-  =  vector  of  arbitrary  parameters 

fj  =  inverse  of  reentry-vehicle  ballistic  coefficient 
Y  =  weighting  factor  in  maximum-likelihood  estimate 
Yk(X)  ,’/d)  -  infinite  sums  related  to  ,  m^'(X)  , 
v7(X) ,u^(X)  =  finite  sums  related  to  z^  m*(X) , 

i  K  ill 

As  -  index  interval  used  in  numerical  integration 
AT  =  step  size  at  which  update  X(0;T)  in  T 
6  =  computer  execution  time  per  As  calculation 


6(s  -  u)  »  delta  function 


ro  "  ro 

ro  "  h 

eo  -  jo 

\  -  eo 


C^(s)  =  function  related  to  r^(s,u)  and  z^(u) 

Ti  =  average  number  of  Newtonian  iterations  per 
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",  =  number  of  Newtonian  iterations  at  T, 
i  i 

A  =  angle  between  zenith  (z-axis)  and  radius  vector  to 
reentry  vehicle 

Cl1  =  unit  angular  vector  in  plane  of  motion 

A  *  covariance  of  the  joint  gaussian  distribution  of  X(0) 
and  o 

U  »  mean  of  the  joint  gaussian  distribution  of  X(G)  and  cv 

v  =  AT/ As 

=  stopping  thresholds 
o  =  air  density  at  altitude 

r  =  total  computer  execution  time  for  processing  signal 
=  computer  execution  time  at 
0  =  zero  vector 

jj- 

fi.(s),\.]  =  orthonomal  eigenvectors  and  related  eigenvalues  of 
rk(s,u) 

x<X0,T)  =  bias  of  the  M1E  X(0,T)  of  XQ 

A 

V  w(X)  ,V*w(X)  =  gradient  of  w(-)  with  respect  to  m-vector  argument 

w  A 

A 

^rrw(X) ,V*«w(X)  *  second  gradient  of  w(*)  with  respect  to  m-vector 

wU  AA 

argument 

S  |X | j  =  norm  of  X,  over  interval  (0  ,T) 


I .  PROBLEM  DEFINITION 


GENERAL  PROBLEM  STATEMENT 

A  general  problem  of  much  interest  and  importance  in  prediction, 
estimation,  and  control  areas  is  th  following:  Given  an  m-vector 
physical  process 


/x  l  ( t  )\ 


X(t)  = 


V*_(t 


i 

V 


where  X(t)  satisfies  a  differential  equation 


(1) 


Mil  =  Frx;t;ol,  (2) 

and  where  n  is  a  vector  of  arbitrary  paramet  ,  (some  known,  some 
unknown) ,  given  some  .»  priori  statistics  on  the  value  assumed  by  the 
initial  condition  X(0)  ,  and  given  a  noise-corrupted  observation  over 
some  (variable)  interval  ro.Tl  of  a  p-vector  process  rented  to  X(t): 

Z(s) ,  0  t  s  s  I,  (3) 

then  it  is  desired  to  make  a  beat  estimate  of  X(t),  0  <  t ,  in  terms 

of  the  observation  (Z(s)  ,  0  <.  s  <  T)  and  the  a  priori  X(0)  statistics. 

If  some  components  of  the  parameter  vector  a  are  unknown,  then  they 

may  also  have  to  be  estimated  in  the  process. 

This  problem  has  been  essentially  solved  for  the  case  where  F(-) 

(1) 

is  linear  and  the  noise-corrupting  Z(t)  is  additive.  This  Memoran¬ 
dum  will  construct  and  apply  a  numerical  algorithm  for  solving  this 
problem  when  F( • )  is  nonlinear,  but  the  corrupting  noise  is  still 
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additive.  The  generality  of  the  above  statement  wi '  be  restricted 
somewhat  for  present  purposes,  but  the  results  can  then  be  extended 
in  obvious  ways  to  cover  the  more  general  case. 


SPECIAL  NONLINEAR  CASE  TREATED 

/*  i(t)\ 


The  vector  X(t) 


will  be  a  real  m-vector  physical 


process  satisfying  the  nonlinear,  time-invariant  differential  equation 

/fl<X;o- )\ 


(4) 


\fm  (X;"V 


where  F(X;o)  depends  upon  an  unknown  but  fixed  (throughout  the  obser¬ 
vation  interval)  vector  parameter  o:  known  parameters  are  subsumed  in 
F(  ••.<>).  Observations  of  the  p-vector  Z(s)  over  a  varying  in’Tval 
CO.T1  are  available,  where  it  is  known  that 


Z(s)  =  H(s)X(s)  +  N(s) 


0  s  <•  T, 


(5a) 


where 


H(s) 


/hl,(s)  ...  hlm(s)' 


.  h  (s)  ...  h  ( s)  / 
pi  pm  / 


(5  b) 


N(s) 


/" ils)\ 


,n  (  s ) / 

\  P  / 


(5c) 


-3~ 


and 


4,(S>\ 

z(s)  =  : 

\%  M) 


(5d) 


It  is  assumed  that  F(X;g')  and  H{s)  are  smooth  functions,  e„g.  ,  ana¬ 
lytic.  Finally,  it  is  assumed  that  the  initial  condition  on  X(s)  at 
time  zero,  X(0)  ,  and  the  values  for  o  have  a  joint  gaussian  distribu 
tion  for  which  the  distribution  mean  and  covariance  are  known: 


/X(0)\ 

EU  J-- 


X(0)\ 

a  r » 


X(0)\  1*| 

[ U)*“j 


positive  definite, 


and  that  the  noise  process  N(s)  has 


EN(s)  *  0 


and  covariance  function 


E[N(C)N  (s)1  «  R(r,s)  positive  definite 


(6) 


(7) 


(8) 


with  n^ 


independent  of  n 


r 


i  4  J ;  thus  R(C,s)  is  a  diagonal  matrix 


ErN(r)N-V(s)  1  -  R(r,s) 


(9) 


The  problem  is  then  to  construct  a  best  estimate  of  a  and  X(t) ,  0  £  t, 
given  f Z ( a )  ,  0  s  s  £  T|. 


HISTORY  OF  THE  PROBLEM 

This  problem  Intersects  related  research  areas  which  have  been  under 
investigation  for  several  decades:  optimum  filtering,  control  theory, 
and  prediction  and  estimation.  An  early  form  was  the  Wiener  filtering 
problem,  now  largely  solved  for  linear  F(-)  and  additive  gauss ian  noise. 
In  the  linear  case  the  optimality  criteria  of  minimum  RMS  error,  as  well 
as  of  maximum  likelihood,  were  successfully  pursued.  It  is  well  known 
that  in  all  cases  (linear  or  not)  the  minimum  RMS  estimator  is  given  by 
the  conditional  expectation  of  X(0)  ,  given  Z(s)  ,  0  £  s  <;  T.  In  the  lin¬ 
ear  case,  it  is  also  well  known  that  the  conditional  expectation  and  the 
maximum-likelihood  estimate  (given  Z(s)  ,  ^  s,  s  <:  T)  coincide. 

Only  in  the  simplest  linear  cases  is  it  possible  to  derive  a  simple 
closed  expression  for  the  conditional  expectation.  Rather,  it  is  often 
necessary  ‘o  derive  the  differential  equation  which  the  estimator  satis¬ 
fies  as  a  function  of  t  and  T,  and  then  to  solve  this  differential  equa¬ 
tion  as  T  increases.  This  is  done  for  the  linear  case  in  Ref.  1  by 
Kalman  and  Bucy. 

In  the  nonlinear  case,  it  Is  natural  to  try  to  extend  the  results 
for  the  conditional  expectation,  since  it  satisfies  the  intuitively 
appealing  minimum  KMS  criterion.  The  increased  complexity  of  the  non¬ 
linear  case  forces  use  of  the  d i f f erent la  1 -equat ion  approach,  via  which 
the  estimator  is  calculated  numerically  as  T  increases.  However,  the 
max imum- 1 ikel ihood  estimator  no  longer  coincides  in  general  with  the 
conditional  expectation.  Since  it  has  certain  computational  advantages 
ovei  the  conditional  expectation,  the  max  num- 1  ike  1 ihood  er'imator  has 
also  been  explored  via  the  d  i  £  f erent  ia  1 -equa t.  i on  approach.  Both 
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approaches  (conditional  expectati —  and  maximum -likelihood  estimator) 
involve  the  conditional-probability  function  of  X(t),  given  Z(s) , 

0  i  s  s  T.  It,  too,  can  be  approached  via  the  differential  equation, 
which  it  must  satisfy  as  a  function  of  t  and  T.  This  is  related  to  the 
differential  equation  which  the  unconditional-probability  function  of 
X(t)  must  satisfy.  See  Ref.  2  for  a  discussion  of  these  so-called 
diffusion  equations. 

An  early  attempt  to  derive  such  differential  equations  for  the 
conditional -probability  function  is  siven  in  Ref.  3  by  Stratonovich . 

The  continuous-time  results  were  incorrect  but  have  been  subsequently 
derived  correctly,  as  noted  below.  An  interesting  and  useful  early 
effort  is  described  in  Ref.  4,  by  W.  M,  Wonham;  it  addresses  the  dis¬ 
crete  case  primarily  and  suggests  a  heuristic  extension  to  continuous 
observations . 

A  natural  way  to  extend  the  earlier  linear  results  Is  to  general¬ 
ise  the  filters  from  linear  weighting  coefficients  of  the  observation 
®°re  complicated  functionals  „f  the  observat i~n.  In  Ref.  3, 
Balakrishnan  develops  the  theory  of  functionals  upon  the  space  of  ran¬ 
dan  observations,  shows  that  functionals  useful  for  oui  problem  can  be 
approximated  by  certain  kinds  of  polynomials,  and  then  shows  how  the 
method  of  steepest  descent  can  be  used  to  construct  a  polynomial 
approximation  to  a  minimum  RMS  eatlmator. 

In  Refs.  6  and  7  ,  Kushner  derives  correct  differential  equations 
which  muat  be  satisfied  as  functions  of  t  by  the  conditional -probab i 1 i t y 
function.  Kushner  there  suggests  approximating  the  optimal  filter, 
which  is  infinite  dimensional  in  the  sense  of  requiring  specification 
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of  all  moments  of  tne  cond i L ional -probabi l i ty  function,  by  a  finite 
f i lter--i .e .  ,  by  ig-^ring  all  but  a  fin4te  number  of  moments.  Bucy  in 
Ref.  8  develops  these  differential-equation  results  of  Kus'nner's  also. 

Another  natural  way  to  extend  linear  iterative  resu’ts  to  non¬ 
linear  problems  is  to  try  to  extend  the  weighting  techniques  straight¬ 
forwardly  to  approximate  optimum  nonlinear  filters.  This  is  done  i; 

Ref.  9  by  Mowery  for  the  discrete-observation  case;  he  displays  some 
interesting  error  calculations  for  the  approximate  estimators  developed. 
Another  aporoach  is  that  of  invariant  imbedding;  in  Ref.  10,  E: liman 
et  al,  derive  a  numerical  (computer)  technique  for  finding  X(t)  which 
minimizes  the  usual  quadratic  norm  over  (0,T)  of  the  difference  between 
X(s) ,  0  *  s  <  T  and  the  observation.  Several  examples  are  calculated 
out  to  show  convergence  performance  of  the  estimation. 

In  Ref.  11,  Friedland  and  Bernstein  derive  differential  equations 
for  a  first-order  approximation  to  the  max imum- 1  ike  1  ihood  estimator  for 
the  discrete  sample  case  and  then  derive  an  analogous  first-order 
approximation  for  the  continuous -time  case.  In  Ref.  12,  Bass  et  al. 
derive  an  approximation  for  the  conditional  expectation.  Specifically, 
by  neglecting  higher-order  terms  in  the  noise  end  in  error  differences, 
differential  equations  governing  the  evolution  with  T  of  the  condi¬ 
tional  expectation  are  derived;  the  approx imat ion  for  the  RMS  error  (rel¬ 
ative  to  conditional  expectation)  matrix  is  similarly  derived.  These 
two  differential  equations  theoretically  could  be  jointly  solved  numeri¬ 
cally  to  c’lculate  the  approximated  conditional  ex  pec t at  ion . 

Various  calculations  have  been  made  in  order  to  compare  the  effi¬ 
ciency  of  the  various  possible  estimators  in  very  special  cases.  One 
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such  calculation  by  Carney  and  Goldwyi.  is  described  in  Ref.  13.  There, 
a  (scalar!  nonlinear  estimation  problem  relative  to  a  linear  dynamic 
system  is  treated  by  estimators  based  on  1  eas t- squares  and  maximum- 
likelihood  criteria;  in  a  linearized  form  of  the  problem,  two  versions 
of  the  Kalman-Buoy  estimator  are  used.  Monte  Carlo  techniques  are  used 
to  generate  comparative  error  statistics.  In  all  cases,  and  over  wide 
ranges  of  governing  parameters,  the  max imum- 1  ike  1 ihood  estimate  is 
preferred.  Kushner  p.oposes  in  an  exploratory  way  in  Ref.  14  possibly 
useful  finite  filters,  derived  either  by  assuming  specific  forms  for 
the  conditional -probability  function  or  by  similar  assumptions  on  a 
finite  "umber  of  the  conditional  moments.  Several  special  results  are 
then  calculated. 

In  Ref.  33  Kushner  derives  the  exact  differentia!  equation  which 
toe  conditional  mode  at  time  1  must  satisfy.  However,  solution  of 
this  equation  is  not  possible,  since  It  involves  higher  derivatives  of 
all  orders  of  the  conditional  probability  distribution.  Recourse  would 
therefore  have  to  be  made  to  approximations.  However ,  the  results  of 
Carney  and  Coldwyn  (Ref.  13)  suggest  that  the  max imun- 1  ike  1 ihood  esti¬ 
mate  for  the  initial  condition  is  preferred  to  the  various  approxima¬ 
tions  tried  then*  and  there  i  m*  that  an  exact  maximum-!  ikei  ilioo-d  esti¬ 
mate  of  the  initial  condition  would  he  preferable  to  approx  i  ma t ions  to 
either  the  conditional  expectation  or  the  max imum- 1  Ike  1 ihood  estimator. 

The  numerical  comparisons  made  in  Ref.  13  and  elsewhere  are  for 
the  scalar  caee,  in  order  to  render  the  required  work  load  acceptable. 
This  Memorandum  is  addressed  to  the  vector  situation,  to  examine  the 
feasibility  of  numerically  solving  th"  vector  differential  equations 


-8- 


satisfied  by  the  max  irnurn- 1  ike  1  ihood  estimator.  The  calculations 
'equired  are  sufficiently  large  that  it  is  not  possible  to  gep~'",te 
error  statistics  allowing  comparison  with  other  techniques.  Rather, 
attention  is  focused  upon  a  specific  observation  samr  le ,  and  the 
questions  of  computational  feasibility,  convergence,  and  solution 
usefulness  are  examined. 


.0  - 


II.  NUMERICAL  TECHNIQUE 

In  general,  the  components  of  X(0)  can.  be  expanded  to  include  O', 

d  '"v 

.u  those  of  F(X;i)  to  include  v(t)  =  'v(O)  =  ->  (i.e.  »  =  >  an‘l  •' 

herefore  can  be  estimated  as  part  .  generalized  X(0).  This  is 
assumed  done  in  the  following,  ir  which  we  ave  set  F  (X ;  o)  =  F  (X)  . 

CONDITIONAL  EXPECTATION 

As  discussed  in  Section  I,  a  desirable  criterion  for  a  ’best 
estimate  of  X(t),  0  *  *  .  in  terms  of  Z(s),  0  t;  s  <  T } ,  is  that  tunc- 

A 

tion  X(t)  which  is  unbiased  and  minimizes 

A  -k  A 

E  f  X  ( 1 1  -  X  ( t )  'X(t)  -  Xt,t)  ;  •  (10) 

It  is  well  known  that  the  conditional  expectation  of  X(t),  given 
■  1’. (s)  ,  0  v  $  <  Tl,  minimizes  (10)  : 

x(t)  -  E'xtt)  !z(») ,  o  *.  s  s  r" 

*  E-X(tl  given  7.  (si,  0  s  •  I  .  (  1  H 

In  the  vcc'or -  linear  situation,  X(t>  as  defined  bv  till  can  be  calcu¬ 
lated  (see  Ret.  If,  pp  .  3  - 1>0  -  3-62 ;  also  Ref  It.  However,  t.r  the  ve 
tor  nonlinear  situation  under  cons iderat  icn  here,  such  u.  not  the  case. 

MAXIMUM- L IKELIBOOD  ESTIMATE 

At!  estimate  alternate  to  that  which  minimizes  i  10)  is  the  so- 
called  max  imum- 1  ike  1  ih.ood  estimate  (MLS.)  of  X(tl:  the  vi  cap  X(  t  i 
which,  given  Z (*>,  0  •-  s  '•  T,  has  the  maximum  conditional  probability 
cf  occurrence  p(X(r);Z(s>,  0  •  s  *>  T) .  This  may  be  developed  as  shown 
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■ X 


L  >y^. 
I*’,  v 
|,V:'-N  .  ; 


■«%*  \  «*’""■ 


r  k-,2 
Z  . 

I 

k  0 

=  1  co 


where  z,  and  m.(C)  are  defined  by 


zV;  =  f  z,(s)  ^(s'ids, 

i  ^  K  1 

”'o 


k,„,  ki„- 

m  .  ( C )  =  E  z.ji. 


r  k 

=  •  H  (s)X(s ;C) •  (s)ds , 

Jo  k 


•/.  •  -  I  I 


k  k 

and  !  - i ( s )  »  "  ar£  t^ie  ortrionormal  eigenvectors  and  related  eigen- 

K  k 

values  of  r.  (s,u);  that  is,  f  >V .  ( s )  ,  '>..1  satisfy 

K  1  1 


/  r  (s  ;u)  •;  (u)  du 

j ,  k  1 

u 


,k  k,  . 
li?i<s) 


Equation  (12)  simplifies  greatly  when  the  noise  N(s)  is  such  as  to 

permit  its  development  as  in  Appendix  A  and  the  r,  (s,u)  can  be  treated 

°  k 

as  delta  functions: 


rk(s,u)  =  ^6(5  -  u)  ,  k  =  1 ,  . . . ,  p. 


Then  (12)  becomes 


p(c|z(s)  ,  0  <  s  <  T)  =  —  , 

/  J (C) dC 


(13a) 


I 
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where 


J (C)  =  exp  |-  |  | (C  -  u.)*A_i(C  -  u) 

L 


+  t  rz 

•* 


_*  -i 


(a)  -  H(s)X(s;C>]  R'^Zis)  -  K(s)X(s  ;C)  ]ds 


A,  ' 

I  1  o 

R  = 


f  13b) 


\°  '*>) 


(13c) 


Equation  (12)  can  now  be  used  to  write  a  general  expression 
for  the  MLE .  However,  in  order  to  reduce  the  computing  complex¬ 
ities,  the  MLE  will  be  derived  and  applied  for  Eq .  (13),  the  spe¬ 
cialization  of  (12)  to  the  case  of  white  noise.  This  is  an  important 
case  in  Its  own  right  and  can  be  generalized  straightforwardly. 

Since  the  denominator  of  (13),  for  given  Z(s)  ,  0  s  s  «  T,  is 
a  constant,  the  MLE  is  that  C  which  maximizes  the  numerator,  which 
is  clearly  that  C  whicn  minimizes 


i(C,T)  =  (C  -  u)*A_1(C  -  n) 


+  /  fZ(s)  -  H(s)X(s;C)]*R"1rZ(s)  -  H(s)X(s ;C) Ids 
0 


L 

L 


(14) 


Direct  minimization  of  (14)  for  fixed  T  is  in  general  impos¬ 
sible  because  the  available  numerical  techniques  require  an  Initial - 

a  * 

estimate  C  close  enough  to  the  sought  X(T)  to  guarantee  convergence. 
For  arbitrary  T  this  is  not  available.  Rather,  an  Ir^rsLlve  numerical 


X(T)  may  or  may  not  equal  XQ,  the  true  initial  condition. 
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procedure  which  builds  upon  preceding  estimates  for  increasing  values 
of  T  must  be  devised.  This  can  be  done  by  developing  the  differential 
equation  which  the  best  estimate  must  satisfy  as  a  function  of  time  T, 
the  upper  limit  of  the  interval  of  observation.  As  indicated  below, 
the  MLE  of  X't),  0  <  t,  follows  naturally  from  the  MLE  of  X(0)  ,  sc 
the  lattei  will  be  estimated. 

Let  us  assume  that  for  each  T  over  a  range  of  T  beginning  at 
T  =  0  there  does  exist  a  unique  value  of  the  vector  C,  X(T)  ,  which 

A 

minimizes  (14).  Then  X(T)  must  be  the  solution,  for  each  T,  of 

T 

Vcf(C,T)  =  2A'1(C  -  u)  -  2  f  VcX*(s;C)H*(8)RJirZ(s) 

JQ 

-  H(s)X(s ;C) Ids  -  0.  (15) 

A 

That,  is,  X(T)  must  satiety 


X(T)  -  u  +  A 


/ 


^X*  (s;X)H*(s)rVz(s) 


-  H(s)X(s;X) Ids 


(16) 


But  if  X(T)  satisfies  (16)  over  the  entire  assumed  T  range,  then  it 
satisfies  there 


dX 

dT 


JX  dT 
AT  dT 


3T  ■ 


(17) 


It  is  useful  to  write  (17)  as  shown,  because  it  explicitly  iden- 
tifies  the  variation  in  X(T)  due  to  variation  in  T  and  that  due  to 
variation  in  X,  However,  because  of  the  white-noise  component  In  Z(s), 
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(17)  can  only  be  viewed  as  u  symbolic-  representation  of  a  corresponding 
differential  expression: 

die  =  A7tX*(T,X)H*(T)R'1H(T)rX(T;X0)  -  X(T;X)]dT 
+  /VcX*(T;X)!*(T)R"ldL(T)  +  7cX(T)dX  , 

where 


L(s) 


/'  i<»>\ 

\vv 


/  a  /  \ 

and  the  !  (s)  are  appropriate  independent  scalar  Wiener  processes.  ’ 

In  the  following  the  convenient  symbolism  of  (17)  will  be  used.  The 
results  are  correct  as  long  as  only  differential  expressions  are  used 
in  the  calculations;  this  is  the  case.  Of  course,  a  true  differential 
equation  analogous  to  (17)  does  apply  in  the  co lored-no ise  case.  How¬ 
ever,  computations  then  are  more  difficult  because  of  the  need  to  cal- 
culate  g  ( s  ; C )  of  Eq.  (12)  as  T  and  C  change. 

Equation  (17)  is  ennivalent  to 


~  (IX  \Y  ★  ^  }  * 

fl  -  7cX(T)  --  ~~  -  '  7CX  (T;X)H  (T)R  rZ(T)  -  H(T)X(T  ,X)  1 .  (18) 


F  r  om  (lb), 


7CX(T)  *  ■  |  r'ccX*(s;X)l*(s)R'lrZ(s)  -  h(s)X(s  ;X)  ’ds 

T 

-  '  j  "cX*(s  ;X)H,'(s)R'1H(s)' rX(s  ;X)d.s  , 


I  -  ?cX(T) 


T 

_1  /*  *  '  *  .] 

•  +  j  7JC  (s;X)H  ( s) R  lH( 

*  n 


s)^  X(s;X)ds 


„X*(s  ;X)H*(s)R'irZ(s)  -  H(s)X(s ;X) ’ds 


But  from  (15), 


,f(X,T)  =  2 


',<2/ 


V  X*(s;X)H''(s)R'1H(s)r,  X(s;X)ds 


1 

-2  f,/(s;X);(s,R-4(s)  -  H(s)X(s;X)  ds. 


and  thus 


I  -  ~CX(T)  =  j  '  ccf(X,T)  , 


so  that  (18)  becomes 


dX  *  -  *  .  1  r  -  , 

CCf(X’T)  Jj  =  2  "CX  (T  ;X)H  (T)R  Z(T)  -  H(T)X(T;X)  . 


Expression  (20)  is  the  fundamental  differential  equation  of 
interest;  and  if  ^^(X.T)  is  nonainguiar,  it  can  be  written  as 


dX  r„  *  -  1  *  «  *  -  1 

dT  "  2~  CCUX,T)  V  (T;X)H  (T)R  fz(T)  *  H  ( T )  X  ( T  ;  X  )  1 .  (21) 


It  was  assumed  above  that  there  did  exist  a  unique  solution  to  (15) 
which  minimized  (141  over  some  T  interval  beginning  at  T  =  0.  A  nec¬ 


essary  and  sufficient  condition  for  this  is  that  V  f(C,T)  be  positiv 

(a 


definite  at  C  =  X  (for  constant  T)  ,  which  requires  that  ?  f(C,T)  be 

nonsingular  in  that  inte:  al.  Therefore,  under  this  assumption,  (20) 

* 

jjr 

can  be  solved  for  ln  the  form  of  (21)- 
Further,  at  T  =  0,  (16)  and  (19)  imply 

X(0)  =n  and  '7CCf(^’0)  '  2A"1*  (22) 

A 

Thus  at  T  =  0  we  are  guaranteed  the  nonsingularity  of  7 f(X,T);  the 
legitimacy  of  the  form  (21);  and  the  initial  conditions  with  which  to 
start  a  numerical  s  lution: 

X(0)  =  u 

=  A  H*(0)R'^Z(0)  -  H(0)u1,  (23) 

T-0 

since  7^X(0;X)  =  I. 

A 

Furthermore,  (22)  implies  that  7^r(X,T)  is  nonsingular  in  some 
finite  T  neighborhood  Co  the  right  of  T  *  0;  since 
T 

f  7^X*(s;x)H*(s)R'1H(s)ricX(s;X)d8 

'  0 

T 

7ccX*(s  ;X)H*(*)R'lrZ(s)  -  H(s)X(s  ;X)  "ds  (24) 

is  a  continuous  function  of  T,  ic  follows  from  (19)  that  7^f(x,T) 
must  be  nonsin6ular  over  some  T  neighborhood  of  I  *  0.  Finally, 

V(X  ,0)  *  2 A  *  ,  so  that  A  hi  ing  positive  definite  makes  f'C.T) 
convex  at  (C,T)  *  (u,C).  Again,  the  continuity  in  T  of  (24)  guaran¬ 
tees  that  7^f(X,T)  will  he  convex  in  some  T  neighborhood  of  T  *  0, 


dX 

dT 
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Thc  exlitence  of  a  minimising  solution  to  V^f(C,T)  =  0  has  been 
shown  in  Ref.  17  to  exist  as  per  the  followii^: 

If 

T 

9 

X  (s)F(X(s) ,s)ds  £  K(T) ( 1  +  ||x||  ) 

*0 

for  each  T,  where  X(s)  =  F(X(a) ,s)  and  ]|x|j  is  the  norm  of  X(s)  over 

A 

(0,T),  then  for  each  T  there  exists  a  minimizing  solution  X(T) ,  and  at 
each  such  T  the  matrix  V^f(X,T)  is  positive  definite.  This  condition 
is  widely  met.  Summarizing  the  foregoing,  we  have: 

THEOREM.  Given  A  and  R  positive  definite  and  that  for  all  T 
of  interest 


f  X*(s)F(X(s>  ,8)ds  <.  K(T)(i  +  ||x||2), 


then  there  exists  a  solution  X(T)  to 


"  f<X,T)  -  2'"i(X  -  u) 

T 

-  2  j  (s;X)H  (»)R*1rZ(»)  -  H( r )X ( s  ;X) ^ds  *  9  (25) 

'0 

which  minimizes 


f(C.T)  -  (C  -  u)  *•  (C  -  u) 


T 

r  r. 


J 


(s)  -  Hi.  *)X(  s)  ;C)  R  ?.(*)  -  H(s)X(s;C)  ds. 


(26) 
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Further,  7  f(X,T)  is  positive  definite,  and  X(T)  se*  isfies  the  differ- 
ential  equation 

dX  A  1  ★  ^  ★  1  a. 

—  =  2r7cct'X  7  X  (T,’:)H  (T)R~  rZ(T)  -  H(T)X(T;X)’  (27) 


with  initial  conditions  at  T  -  0: 


XvO)  = 


dX 

dT 


H*(0)R'lrZ(0)  -  H ( 0 ) a 


(28a) 

(28b) 


T -0 


The  usefulness  of  these  results  in  calculating  the  MLF.  for  X(0) 
wiil  depend  upon 

1.  The  ease  with  which  a  numerical  solution  of  (77)  can  he  im- 
p 1 emen  ced , 

2.  The  integration  interval  which  must  be  used  in  order  to 
guarantee  continuation  in  the  neighborhood  of  he  desired 
so  1  u  t  i  on  , 

> .  Die  nature  of  the  solution  to  which  X (T)  converges  with 
inc  rea.s  in"  i  , 

■>  .  Hie  rate  at  which  convergence  of  X  ( T)  occurs, 

These  depend,  natural  lv.  upon  R  md  in  a.  critical  wav. 

Since  t he  so  1 u t i on  o i 


dX 


F(X> 


cor  respond { ng  to  a  given  X  ,  0  )  is  unique,  then  on>  m»  >:i»~edi  » t  «■  1  v 
extend  a  MLE  for  X ( 0 >  to  a  MLE  for  Xt  t '  ,  t  <  0.  For  in  that  vase 
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» 


it  is  natural  to  take  as  the  MLE  of  X(t),  given  X(0), 

X(t;T)  =  X(t;Xr.);T’)  ,  (29) 

where  to  show  the  dependence  upon  both  t  and  T  we  have  written  X ( t ; T ) 
MLE  of  X ( t ) . 

It  therefore  suffices  to  estimate  X(0)  and  then  numeric  illy  inte¬ 
grate 


^  -  F(X) 
dt 


(30) 


from  this  initial  condition  to  find  the  MLE  of  X(t).  This  is  th. 
approach  assumed  in  the  application.  However,  it  is  of  interest  to 
develop  the  differential  equation  which  X(t ;T)  satisfies. 

Bv  u'9) 

X(t  ;T)  *  X(t ,Xr0;T') . 

where  X  (0  ;T)  =  MLE  of  Xf0>  given  7.1s)  ,  0  •  s  •  T. 

Tb.en 


:0C(  t  ,T)  v,  vX(t  ; T  ) 
dT  *"  •! 


_X(  i  ;  X  C 


4^ 

dT 


|  F(X'l') 


FtX)  ,  t  *  T 


20- 


and  (27)  then  Imply 


27rX(t,Xr0;T1)r7ccf(ir0;T^ \T)7'l7rX*(T;xrO;T?H"( 


r"UZ(T)  -  H(T)X(T;X"0;T1)  ■ 


for  t  *  T 


dX(f.J)  m  I 
dT  | 


and 


F’X(T  ;Xr0;T’’) 1  +  2"  X (T  ;X’0  .T1) 


r  ;ccu^0:T  ,T)  ll?cx*(T;*r°;Tl) 

H*(T)K~lrZ(T)  -  H(T)X(T;Xr0  ;  T  3 ) 
for  t  =  T. 


Equation  (31)  nay  be  rewritten  as 


27^  (t;T) " '?cc  f  (X rC ; T  \T)  1 ' 1 7  x* ,  T;T>H*(T>R 
'Z(T)  -  H(T^X(T;T) ’ 
for  t  '  T 


«iL ill  .  J 
«  1 


and 


F'X(T;T>  '  -  27  X(T;T)r7  f(X'0;T\T)  ’* ‘7  X*(T  ;T)k'*’ 
'Z(T)  -  H(T)X(T;T)  ' 


^  foi  t  -  T. 


In  the  ahovt  ,  X(t;T)  ha*  been  used  to  indicate  dependence  upon 
note  that  X(T'  *  XfO.Tl,  from  earlier  nomenclature .  We  no.  tin 
use  X  (T 1  in  the  foil  owl  nit. 


(3 


(T>R 


t  and 
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III.  ERROR  STATISTICS 


A  lower  bound  for  the  conditional -error  statistics  of  the  MLE  de  ¬ 
veloped  in  Section  II  will  be  derived  - -spec i f leal  1 y ,  the  unbiased  Cramer 
Rao  bound.  It  has  not  been  possible  to  derive  an  exact  error  estimate  . 
or  an  upper  bound  which  is  useful  When  T  is  large,  the  estimate  CjiI 
be  expected  to  become  more  md  more  independent  of  the  a  priori  matrix 
The  bound  developed  here  does  not  involve  and  could  only  apply  for 
large  T.  As  such,  it  is  a  useful  bound  with  which  to  compare  conver¬ 
gence  of  the  estimator  for  large  T. 

Defining  the  bias  of  the  MLE  X ( T >  of  as 

i(X0,T)  =  F ( X ( T )  -  >  XI ,  (331 

then  a  direct  generalization  of  the  r «  suits  of  Ref.  It  ipp.  3-2  -  3-S) 
to  the  continuous  sample  tmu  case  gives 


c  (X .  ,'T’)  •-  Trace  E-r'X(T>  -  X  '"X(T)  -  X  "*'x, 
u  t!  0  o 


>  Trace*  "I  *  ,T>  '  'I  *  VcfXn  ,T!  V  1  \ 


C  O’ 


where  the  In  ten  mat  ion  m.  tr- 


C  «•  F.  f  T  ;0g  p(2(s) ,  0  <  s  s  TiX-I  -.'T  lc,x  ?(*(«>.  0  *•  *  Tix,,)'  ix 

v.  0  c  U  '  U 


(  3  U  e  1 

Tills  is  the  Cramer  *  Rao  bound ;  and  that  ;  3-.)  holds  for  tin*  eonClnuoua 


sample  time  ca -c  follows  irom  s  limiting  development  analogous  tv  that 
in  Appendix  A.  Directly  from  Append! ..  A  t  >•»  1  •  evg  the  expression 


P(Z(s)  ,  0  <■  s  ■  T  [X  ) 


r  -  . 

K  exp  4  j  'z(s) 

l  “  •'n 


H  ( s ) X ( s ;X„> '*R‘‘ 


^  ( s)  -  H(.s)X(s;X  )'ds 


(35) 


where  K  is  an  appropriate  normalizing  Ci  .start. 


We  tiien  have 


T  T 


^  =  1  f  V  (s;X0)H*'s)R’lrH(s)X(s:Xn) 

’0  '0 


N(s)  -  H(s)X(s;X0) 


H(u)y(U;X())  4  N(u)  -  H(u)X(u;X  rYVu)?  X(u;Xrt)d 


r,u,^0;ds  dujXQ 


T  T 

I  1  c 


I  7fx  (s;X0)H  (c’io*1dw.  Nr,-1 


0  0 


S)R  R5<s  -  “>R*  H(u)7cX(U;X0)ds  du, 


nr 


(36) 


G  = 


Z  f  7CX^s:X0)H*<s>R'iK(s)7cX(s-X0)ds. 


(37) 


Substituting  (37)  into  (3d)  gives 


e  (X0,T)  >  4  Trace  <  rI  4  V 


cx<X0>T)l  fl  +  7  a<X  T)1 


L  0 


V  (s;X0)H*(.-t)R'i!!(s)7cX(s;X0)ds 


C'O 
1 


>  - 


(38) 


A  singularity  does  not  occur  in  (38)  at  T  =  0,  since  _  T  =  0  the 
-ocfficient  1 1  +  vc't'(X0,T)]  is  zero.  For,  from  (16)  the  MLE  X(T) 
satisfies  (suppressing  f  -  T  in  Xfl]  in  the  right-hand  side) 


XiT)  =  u  + 


f 


:X)H  (s)R  irZ(s)  -  H(s)X(s;X)  ds. 


Subtracting  X^  from  (39)  and  taking  the  expectation  gives 


./Y  T' 
■"  O’  ' 


X0  +  .-.E<  j  ~cX*(s;X)H*(s)R‘lrZ(s)  -  H(s)X(s;X)  Ids |X0j« , 


from  which 


,x<X„,T)  =  -  I  +  AE  <  /  T  _X  (s ;X)H  (s)R'lfZ's) 


,  1 

-  H(s)X(s;X>1ds|X0J- 


f  J  ^ 

AE J  j  7cX*(s;X)H*(s)R'1H(s)7cX(s:X)ds!Xc  f  , 


so  that 


i  +  v<VT) 


=  \e|  j*  7CJ.X*(s;X)Hir(8)R‘1[z(8)  -  H(s)X(s  ;X)  ]ds  |XQj 


-  Ae|J  VcX*(s;X)H*f6)R'1H(s)VcX(s;X)d8|X0|  ,  (42) 


which  is  zero  for  T  =  0. 


The  expression  (42)  for  I  +  ^  us(Xq,T)  has  not  been  evaluated 
exactly.  Using  gross  eliminations,  we  heuristi cal ly  show  that 
Vcoa0,T)  decreases  monotonical ly  as  T  increases.  This  amounts  to 
linearizing  the  error  expressions,  so  that  the  resulting  expressions 
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are  correct  for  the  linear  case  but  are  only  approximate  in  the  non- 

A  A 

linear  situation.  Assuming  that  X(s;X)  and  7^X(q;X)  can  be  expanded 
about  Xq  in  power  series,  we  have 

X(s  ;X)  =  X ( s  ;XQ)  +  ^(s^HX  -  XQ) 

+  jr  (X  "  X0)*?ccX(s;X0)(X  -  XQ)  +  ...  ,  (43) 

and 

V<s  *  '  V(*!V  +  ,'ccx(’;Xo)(* '  V 

+  jr  (X  -  X0)*"CCCX(»:X0)(X  -  x0)  +  ...  .  (44) 

Substituting  (43)  and  (44)  into  (39)  and  linearizing  the  error 

A 

expressions  by  dropping  all  terms  in  (X  -  XQ)  higher  than  the  first 

A  J-J 

and  in  (X  -  Xq)  and  N(s)  where  n  >  1,  there  results 

T 

X  -  XQ  =  -  (XQ  -  u)  -  A  f  7cx*(6;X0)H*(slR'1H(s)VcX(s;X0)ds(X  -  XQ) 

•'0 

T 

+  A  f  Vcx*(s;X0)H\s)R'1N(s)ds. 

A 

Factoring  out  and  solving  for  X  -  Xq  gives  (45)  below,  which  i3  exact 
for  the  linear  case:  thi  fact  can  be  shown  simply  by  substituting 
Z(s;Xg)  for  the  linear  case  into  (39),  subtracting  Xq  from  both  sides, 

A 

and  solving  fox  X  -  Xq. 
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X 


T 

A'1  +  |  VcX*(fi;X0)H*(.)R'1H(s)VcX(s;X0)d6 

Jo 


T 

-  A-1(Xq  -  u )  +  f  '7cX*(8;X0)H*(s)R‘1N(8)ds 

J0 


(45) 


Thus  for  large  T 


w(X0,T)  =  E[X(T)  -  XQ  i XQ 1  =  B^X^T) 


A~^(Xq  -  u)  + 


x 

f  V*(s;X( 

o'  r\ 


)K*(8)R'1ErN(s) "ds 


or 


o)(X0,T)  =  -  B~l(X0 ,T) A*1  (XQ  -  u), 


( 46a) 


where 


T 

B(X0,T)  =  A*1  +  f  VcX*(s;X0)H*(8)R'1H(s)VcX(8;X0)ds.  /46b) 

*0 

Equation  (46' ,  arrived  at  via  the  preceding  lii  .arization,  can  be 
u.  ith  (38)  to  estimate  the  biased  Cramer-i’ac.  bout'id.  However, 

B(Xq ,T)  can  be  written  as 
T 

B(Xq,T)  -  A'1  +  f  [R'^H(8)VcX(,;X0)l*rR‘Si(«)VcX(s;X0) Ids,  (47) 

*o 


*  ich  is  the  sum  of  two  squares  and  therefore  is  monotonically 
increasing.  If  its  limit  as  T  increases  is  sufficiently  large,  then 
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8'  l(X  T)  ->  e(X  )  =  0.  (48) 


For  such  and  for  I 

0 

arge  T  the 

inequality  (3Q1  becomes 

■\ 

! 

r  I 

-i 

i-M 

1 

>  f  * 

*  - 1 

;  I 

c  (X0,T)  >  4  Trace  ^ 

J  V  (s 

;X0)K  (s)R  H(s)7  x(s;X0)ds 

\ 

i 

V 

w  0 

- 

J 

which  is  the  unbiased  Cramer -Rao  bound.  Trie  lower  bound  given  in  the 
right  side  of  (49)  Is  probabLy  too  small,  in  view  of  the  assumptions 
employed  in  its  heuristic  development,  and  because  (48)  will  not 
always  be  satisfied.  In  fact,  comparing  (46b)  and  (37),  it  is  evident 
that  (except  for  very  large  A  S  B  ^(X^,T)  very  small  implies  that  G 
is  also  very  small;  i.e.,  the  unbiased  Cramer-Rao  bound  is  then  also 
essentially  zet  1.  When  (48)  is  not  satisfied  it  is  possible  to 
approximate  the  biased  Cramer-Rao  bound  using  the  gradient  of  (ju(X^,T) 
as  found  from  (4oa).  However,  the  ^ound  given  by  (49''  will  be  dis¬ 
played  later  for  the  application  being  made.  As  mentioned  earlier, 
this  result  could  only  apply  for  large  T,  when  it  is  expected  that  the 
estim;  e  becomes  independent  c!  A,  which  does  not  appear  in  (49).  In 
i  any  case,  the  unbiased  Cramer-Rao  bound  Is  an  interesting  standard 

'  with  which  to  compare  the  estimate. 

j| 

i 


i 

j 

1 

I 

i 

\ 
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IV.  AN  APPLICATION:  REENTRY  PREDICTION 

GENERAL 

As  an  example  application  of  the  foregoing  theoretical  results, 
they  will  be  used  to  estimate  the  initial  conditions  of  a  reentry 
vehicle  (RV)  which  has  been  detected  and  is  being  tracked.  It  is 
assumed  that  a  priori  information  is  available  relative  to  the  space 
point  at  which  the  RV  will  be  detected  and  to  its  velocity  at  that  point 
In  the  particular  case  at  hand,  the  acquisition  and  tracking  take 
place  during  the  last  5  seconds  before  impact,  so  that  it  is  natural 
to  think  of  these  a  priori  data  as  having  be^n  supplied  by  a  long-range 
surveillance  radar  for  purposes  of  acquisition  and  track  initiation. 

In  such  a  situation,  it  is  natural  to  ask  how  to  use  these  a  prior,', 
data  foi lowing  acquisition,  in  order  to  make  the  best  estimates  and 
predictions  during  the  tracking  phase. 

Depending  upon  the  circumstances,  it  may  be  necessary  to  estimate 
certain  RV  parameters,  either  initially  or  continuously,  in  order  to 
predict  its  reentry  path.  In  particular,  the  RV  lift  and  drag  coeffi¬ 
cients  are  often  poorly  known  constants  or  functions  of  time.  These, 
and  other  unknown  parameters,  could  be  estimated  within  the  framework 
of  the  preceding  theory,  simply  by  including  them  as  extra  components 
of  the  vector  X(t)  and  including  components  describing  their  varia¬ 
tion  with  time  in  F(X) .  Similar  remarks  apply  to  certain  variable 
environmental  factors,  such  as  air  density  at  the  reference  altitude, 
which  vary  continuously.  However,  inclusion  of  these  extra  parameters 
in  the  estimation  example  greatly  increases  the  complexity  and  diffi¬ 
culty  of  the  computing  task  without  adding  very  much  to  the  usefulness 
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of  the  example.  Therefore,  the  estimation  procedure  will  assume  that 
all  RV  and  environmental  parameters  are  known  accurately. 

Similar  reductions  in  computing  complexity  can  be  made  by  restrict¬ 
ing  the  RV  motion  to  a  we  1 1 -known  vertical  plane  containing  the  tracking 
rid«r.  Estimatlor  and  prediction  within  just  that  plane  is  still  a 
sufficiently  rich  problem  to  display  the  technique.  Also  RV  character¬ 
istics  are  assumed  to  cause  only  drag  acce lerations ;  no  lift  or  deflec¬ 
tion  forces  operate  (the  latter  is  ruled  out  by  the  pi anar-moticn  as¬ 
sumption).  Finally,  for  simplicity  a  flat  earth  will  be  assumed,  as 
well  as  a  force  of  gravity  that  is  constant  throughout  the  altitudes 
concerned  and  directed  along  the  radar  vertical.  In  the  case  at  hand 
these  latter  are  both  acceptable  assumptions  because  the  ground  range 
and  altitude  variations  are  so  slight.  However,  ever,  if  they  were  not 
slight  an  estimation  example  which  treated  them  as  constant  would  he 
as  useful  as  if  they  were  otherwise  treated,  as  long  as  the  equations 
of  motion  actually  used  are  correctly  reflected  in  the  trajectory  esti¬ 
mation  procedure. 

EQUATION-  OF  MOTION 

Given  these  restrictions,  the  situation  can  be  displayed  in 
spherical  coordinates  as  in  Fig,  l,  where  the  vector  meanings  are 
noted.  In  spherical  coordinates  the  equation  of  motion  in  one  plane 
is<18) 

V  «  r^r  -  r&2]  +  9^2*0  +  r'o]  -  F/ra  -  fD  G]  /m,  (50) 

where  the  dot  represents  the  time  derivative,  the  bar  represents  a 


vector,  and 
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RV  velocity 


unit  radial  vector 


nit  angular  vector  in  plane  of  motion 


radial  distance  to  RV 


-  angle  between  zenith  (z-axis)  and  radius  vector  to  RV 

-  total -force  vector  acting  on  RV 


-  drag-force  vector  acting  on  °V 


gravity-force  vector  acting  on  RV 


=  RV  mass 


Force  D  may  be  written  as 


where 


2  SVV% 


=  RV  drag  coefficient 


-  RV  frontal  (dragl  area 


air  density  at  alt i tut 


-  unit  vector  along  the  RV  line  of  flight,  or 


v  rri  4  t,Jpi 

TvT  ~  -2  '  2.2  ,1 

r  +  r  a 


Force  (1,  because  of  the  simplifying  assumption  about  Its  din 
tion,  may  be  written  as 


mg  cos  -r  e  mg  sin 


g  -  gravitational  acceleration 


z,  -  unit  vector  along  z-cootti  inate 


'  >\. 


ii 

: 
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Thus 

•  _ 

*  ■  [('2  CdVv2)  TvT  '  ”8  c°‘  0J  71 
+  [  (-  2  CdVv2)  Ivi  +  mg  8l“  e]  • 

and  combining  this  with  (50)  gives 

[r  -  r(,2']rl  +  [2r6  +  r'el^  -  -  [j  C^pV2  -[^  +  g  cos  e]rx 

-  [i  Vcpv2  MS  '  8  8ln  9]V  (51) 

Equating  coefficients  of  r^  and  0^  in  (51)  , 

r  -  r62  -  -  -  CpA^lvIj  -  g  cos  0  ,  (52a) 

2rQ  +  r6  -  -  \  VcpM^  +  g  sin  0  .  (52b) 

Substituting  |vj  =  (r2  +  r2^2)1^2  into  (52)  gives 

r  -  rb2  -  -  j  CjjAcp(r2  +  r2^2)1^2  ^  -  g  cos  9,  (53a) 

2^0  +  r0  -  -  \  ^^(r2  +  r202)1/2  ^  +  g  sin  6.  (53b) 

If  we  now  define 

a  .  __ 1 _ !_ 

WVc  "8/cu*c  ’ 

where 


W  ■  mg 

is  the  weight  of  the  RV,  there  results  the  final  form  for  the 
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Thus  we  can  write 


/ 


dX 

dt 


F(X) 


2  1  <»  /  2  ,  2  2. 1/2 
X1X4  "  2  ®6Px2'x2  +  xlx4^  "  8  cos  x. 


\ 


{"  2x2x4  •  \  ®8°xix4<x2  +  xlx4)1/2  +  8  8in  x3y 


(56) 


Here  p  varies  with  altitude  and  for  this  application  will  be 
assumed  to  vary  as 


P  -  pQ  exp  (-  ox)  «  pQ  exp  (-  or  cos  9). 


(57) 


This  uses  Z  as  the  effective  altitude  for  estimating  p  but  is  accept¬ 
able,  given  the  small  ground  ranges  met  in  the  current  application. 
Equation  (56)  can  then  be  rewritten  as 


/: 


£-F00 


21  222  1/2 
xlx4  “  2  B8Pqx2^X2  +  X1X4*  *XP  °*1  COS  X3^ 


g  C08  X, 


1  r  1  222  1/2 

—  {-  2*2*4  '  2  Bgp0xlx4(x2  +  xix4)  ^  exP  (*  C°8  *3* 

\  1  i 

+  g  sin  x3}^ 
(58) 


Equations  (55)  and  (58)  describe  the  physics  of  the  process. 

Values  for  p,  and  the  time  interval  of  observation,  have  been  selected 
so  as  to  result  in  highly  nonlinear  variations  in  position  and  velocity 
coordinates.  This  is  illustrated  in  Fig.  2,  which  gives  these  coordin- 
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ates  as  a  function  of  f  ^  rw1 .  as  well  as  the  values  used  for  ? ,  g,  c  , 
and  Also  shown  is  the  axial  RV  acceltrat  on,  to  illustrate  the 


h.ighly  nonlinear  nature  of  the  RV  forces  and  motion.  These  results 

(19) 

were  computed  using  ?  Rand  computer  program  called  ROCKET. 


RADAR  OBSERVABLES 


The  radar  is  assumed  able  to  measure  di.ectiy  the  RV  coordinates 
Xj  =  r ,  =  r ,  x-  =  * ,  with  measurement  being  exact  except  for  addi 

tive  noise.  Any  multiplicative  factors  could  have  been  inserted  into 
the  measurement  of  these  three  coordinates  with  no  added  complexity. 
This  is  equivalent  tc  taking 


H(s) 


/l  0  0  p\ 

\ 

0  1  0  0  I  ---  H, 

l  j 

\0  0  1  0  j 


so  that 


Z  (i.)  =  H(s)X(s)  +  N(s) 
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COMPUTATIONAL  APPROACH 


Mixed  Dif  fer  er.  t:  i  a  1- Equation/ New  Coni  an  Approach 

The  objective  of  a  computational  scheme  is  to  calculate  the 

A 

solution  X(T)  to  (25*  which  minimizes  (26): 


?  f(X,T> 


2A_1(X  -  u> 


-  2  !  7  X*(s;X)H*R‘1fZ(s) 

Jo 


-  HX(s;X)]ds  =  0, 


f(C,T)  = 


(C  -  u)*A_1(C 


-  HX(s;C) 


-*  -In 

s;C)  i  R  T Z ( i 


HX(s;C)1ds. 


Two  applications  are  possible:  real-time,  on-line  computations,  and 
nonreal -time ,  off-line  computations.  Both  applications  can  be 
approached  via  the  preceding  theoretical  developments.  The  approach 

A 

involves  solving  (27)  for  X(T)  as  the  observation  interval  for  Z(s)  , 
(0,T)  grows  with  time  T;  the  solution  is  initiated  by  using  the  initial 
conditions  given  by  (28): 


A\r  *  1  A  ★  1  A 

Jjr  =  2[Vc€f(X,T)r1Vcx  (T ;X)H  R"l[Z(T)  -  HX(T;X)1, 


x(0)  -  u;  ~  =  AH*R_1[Z(0)  -  Hpl. 
dT!T=0 


Sincp  the  solution  of  (27)  is  to  be  done  numerically,  it  really 
amounts  to  taking 
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X(T  +  AT)  =  X(T)  +  AT J  (59) 

using  (28)  as  the  s tat  ting  condition;  actually  a  suitable  numeric- 1 
integration  technique  is  used  to  effect  (59) ,  specifically  Runge- 
Kutta.  The  question  of  suitable  AT  size  immediately  arises.  Depend¬ 
ing  upon  the  magnitude  of  the  factors  in  (27),  very  small  AT  values 

A 

may  be  required  to  keep  the  resulting  X(T  +  AT)  from  diverging  from 
the  desired  exact  solution  to  (27).  When  this  situation  applies,  com¬ 
puting  times  can  become  so  great  as  to  be  impossible,  even  for  off¬ 
line  applications. 

Since  we  arc  estimating  the  initial  condition  X(0)  ,  X(T)  need  not 
charge  rapidly  with  T;  for  example,  it  certainly  need  not  change  as 

A 

dX 

per  F(X),  It  is  reasonable,  therefore,  to  consider  weighting  ~r  in 
(59)  so  as  to  decrease  its  destabilizing  effect  for  a  given  size  AT: 

X(T  +  AT)  -  X(T)  +  V  AT,  (60) 

where  Y  is  a  constant,  0  •£  V  <  1 . 

A 

Naturally,  for  finite  AT  and  constant  V  the  value  for  X(T  +  AT) 
resulting  from  (60)  will  differ  from  that  value  which  truly  minimizes 
(26).  However,  if  it  is  close  enough  to  the  correct  value,  then 
steepest-descent  or  Newtonian  techniques  can  be  used  to  improve  the 
estimate  for  constant  T.  For  example,  if  Newton's  method  is  applied, 

A 

then  a  sequence  of  improving  estimates  for  X(T)  results  from  taking 

Xn+1(T)  -  Xn(T)  -  F7ccf(XnrTl  ,T)  (Xn[Tl  ,T)  .  (61) 

It  is  immediately  clear  that  several  exchanges  can  be  made  between 
these  two  techniques,  in  order  to  gain  either  computational  accuracy 


.r****.*,  w  «u*, *W«,V 
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Gr  efficiency,  or  both.  Note  that  even  for  Y  ~  0,  the  application  of 
(60)  and  i 6 1)  affords  an  approach  as  near  as  desired  to  the  correct 
X(T) ,  provided  the  conditions  for  Newtonian  convergence  are  met. 

These  conditions  are  (see  Ref.  20,  p.  63) 

If  the  left  member  of 


V(C.T)  ' 


satisfies 


1#  5  dj  ,  j  =  1,  m, 


2.  The  matrix  V  f(X  ,T)  has  a  nonvanishing  determinant  R  (with 
absolute  value  |Dj)  and.  for  the  absolute  va'ue  |A.  j  of  its 

L  J 

cofactors  , 


7  TdT  ^Kj1  5  V 


3  ** 

3.  The  elements  of  2  f(X,T),  s  =  iQ.  are  bounded : 

CCC  ijk 


!3ijki  *  d3 


In  the  region 


i  -yi  -  2b 


max  x. 


dld2’ 
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whore  X 


hi 


S 

m  / 


and 


is  defined  by  and  satisfies 


b0  =  d2d ld3m^ 


1  !  9 
*■'  1 1 


then  the  system  7^f(X,T)  =  0  has  a  solution  which  can  be  obtained  by 
Newton's  method  (61), 

Joint  application  of  (60)  and  (61)  would  then  proceed  as  follows, 

dxl 


At  T  =  0,  take  X(0)  =  u ;  compute 


e  s  t ima  t  e 


dT 


from  (28)  and  apply  (60)  to 


T=0 


X° ( AT j )  =  u  +  VAH*R_1fZ(0)  -  BulATj. 


Then  apply  (61)  until  X  (AT^)  satisfies  certain  conditions,  sach  as 

all  coordinates  of  V^f(xntATjl,  AT^)  being  less  than  some  constant,  or 

all  coordinates  of  [ Xn( AT ^ )  -  Xn~ l( AT^) V j |xu  1 ( AT j ) I !  being  less  than 

some  constant.  Taking  this  last  estimate  for  X(AT^)  ,  then  apply  (60) 

to  compute  X  (AT.  +  AT?)  and  so  forth  until  T.  AT  -  T,  the  upper  limit 

i  1 

of  the  observation  interval.  We  will  refer  to  each  application  of 
(61)  as  a  Newtonian  iteration.  Application  of  either  (60)  or  (61) 
involves  numerical  calculation  of  the  various  factors  occurring  there¬ 
in,  and  as  these  equations  suggest,  involve  essentially  the  same 
amount  of  computing  time.  Assuming  this  to  be  case,  it  i.-,  possible 
to  make  a  rough  estimate  of  the  alterations  in  computing  time  possible 
via  (60)  and  (61),  as  follows. 
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Computat  ion  T ime 

In  general,  the  integration  interval  As  used  in  the  numerical 
calculation  of  the  factors  of  (60)  and  (61)  need  not  equal  AT;  in 
general  it  will  be  preferable  not  to  have  As  =  -AT,  in  order  to  permit 
sufficient  data  density  over  (0,T)  even  though  AT  is  fairly  large. 

For  reasonably  large  T,  the  calculation  t irce  is  dominated  by  the  per 
As  integrations. 

Let 

AT  -  step  size  at  which  update  X(T)  in  time  T 
As  -  time  interval  used  In  numerical  integration  for 
calculating  the  factors  in  (60)  and  (61) 

u  =  AT/ As 

5  =  computer  execution  time  per  As  calculation,  assumed 
the  same  in  application  of  either  (60)  or  (61) 

T]  =  number  of  Newtonian  iterations  at  time 
T1  ■  average  number  of  Newtonian  iterations  per  time 
T  =  total  time  of  observation  to  date 
r  =  computer  execution  time  at  time 
t  =  total  computer  execution  time  for  processing  signal 
over  (0,T) 

N  =  T/AT  =  number  of  updatings 

At  time  T  ,  the  computer  »  st  process  numerical  integrations 
Ti 

across  —  intervals  in  each  calculation  of  the  factors  of  (61),  plus 

As 

once  again  for  the  updating  via  (60).  Thus  we  can  write 


6(11  +  1)T  6(T).  +  1) 


A8 


As 


1  AT  *  i  5  ( Ti  +  1) 


AT 

As’ 


i 

i 


! 
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or 


Then 


r,  -  16(-n  +  l)v>. 

L  1 


N 


N 


N 


T  =  I  T  =  1)6  ’  i(Th  +  1)  ¥  "6(T!  +  1)  £  i 

1=1  1  i  =  ’  1=1 


(62) 


nSCn.  +  1)N(N  +  1) 


so  that  from  N  =  —  and  N  +  i  =  N  there  allows 


,  il.\  =  iLELLil  _lL 

T  2  Vat  /  2  AsAT' 


(63) 


E 


It  is  th  variation  «  f  total  computer  execution  time  with  that 

forces  the  us.  of  • -ther  large  As  values,  as  well  as  the  use  of  rather 
large  AT, 

E'en  witu  only  a  four-  ocrdiuate  vector  X(s) ,  the  number  of  equa¬ 
tions  which  must  be  integrated  numerically  to  apply  (60)  and  ^61)  gets 
excessive  unless  th  computer  is  of  special  design.  The  computers 
available  are  far  from  specially  designed,  naturally,  and  further, 
a (ready -programmed  numerical  procedures  have  been  used  whenever  pos 
sible  to  reduce  programming  ffort.  Ine  result  is  that  the  procedure 
used  here  for  computation  is  very  inefficient,  timewise,  taking  about 
one  second  per  complete  As  calculation  ( integration) --i .e  ,  6  ?  1  sec¬ 
ond.  The  resulting  effect  upon  t,  given  by  (63)  ,  is  plotted  in  Fig.  3 

6(Tl  +  ’)  4 

is  a  function  of  T  ior  —  .  ~  -  10  .  Since  t  scales  linearly  with 

AsAT 


AsAT 


,  Fig.  3  can  easily  be  read  for  values  other  than  10  for 


this  fac'or.  Except  for  r  sther  small  values  of  T,  Fig.  3  sugge.  ts 
X(T)  caicuia  'on  times  two  or  three  orders  of  magnitude  greater  than 
the  real-time  observation  being  processed  Sine,  it  appears  that  such 


<  X 
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A  A 

Given  an  X(T) ,  to  apply  (60)  to  the  calculation  of  X(T  +  AT)  re- 

*  - 1 

quires  the  calculation  of  the  factors  in  (27',  that  is,  ^^f(X,T)^ 

A  A 

7^X(T;X),  X(.T;X),  and  the  performance  of  manipulations  indicated  in 

A 

(60).  First  consider  X(T;X).  A  basic  given  in  the  problem  statement 
is  the  differential  equation 


=  F(X) .  (64) 

The  Runge-Kutta  integration  routine  in  the  Rand  program  library 
has  been  directly  utilized  to  compute  X(T;X)  via  (64).  Directly  from 
( 64)  , 


X(T;X) 


l 

/ 


<g..i£lXJL 

ds 


ds  +  X  = 


r 

J 


F  X(s;X)  Ids  +  X, 


(65) 


so  that 


AX (I  ;X) 

Ac, 


T 

r 


=  / 


ds  +  I. 


k 

/  X  (  s  ; 


x)FrX(s;X)  1  ds 

k 


+  1 


k’ 


( 66a) 


where 
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Equation  (66a)  may  be  rewr  -ten  as 


A 

•-  r 


*,FrX(s;X>  ’7X(s;X)ds  +  I 
X ( s  ;X)  C 


(67) 


where  I  is  the  mXm  identity  matrix. 


Equation  (6/)  can  then  he  written  as 


>7cX(T;X) 

vf 


•  7X(T;X>prX(T;X'''cX<^X)’ 


(68) 


and  the  same  Runge-Kutta  integration  can  be  used  to  calculate  7  X(T:X) 
via  (68),  once  X(T;X)  is  known  via  (64),  using  as  the  initial  condi¬ 
tions 


7  X(0,X)  -  I  ,  F r X ( 0  : X )  =  F(X)  . 

v 


(69) 


'7ccf(X,D 

r  i  j...  (19),  — — ^ - may  be  vi  tten  as 


^ccf(x,T) 


27cX*(T ;X)H*R* !H7cX(T ;X) 


-  27c„X*(T;X)H*R'lrZ(T>  -  HX(T;X) 


wi  th 


CC 


,f(X,0)  -  2 A ” 1 . 


(70a) 

(70b) 


The  vectors  X(T;X)  and  7^X(T;X)  are  availab’.e  from  (64)  and  (68)-(69); 
similarly,  from  (67), 


*7  x(T ;X)  r  d,f7v/_iiNFrX(8;X)  7fj[(s;*)l 


C  2  Xd;*)1 

J  *c> 


ds 
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I  P  7  i\FrX(s ;X) ' 1 

f  \ - KsJLi - -  X(s;X) 

I  i  ^c.  c 


"  X(s;X)', 

*  W>rf)“*,*r  ~7— j e' 


W>«.db,'M'ii>'  ^  V<’*> 


+  "  ,^FrX';s;X) 

X  {  s  :X  j 


,X(s;XV 


cs  , 


which  is  equivalent  to 


?CCX(T;X)  *  /  ?X(.,X,X(s;X)FrX's;X)'VCX(s:XrcX<s-X)d8! 


f  r  - 

X(s;X) 


F  X(  s  ;X  i  '7  X(s  ;X)ds 


X(s;X)  . .  CC 


which  can  he  rewritten  as 


*7X(T;X) 

— — - -  7  *  ,  '  F'XiT  iX'i  '7  X(T  ;Xi~  X(T  .X> 

X(T;X)X(T;X)  C  C 


x(T;iirrx(T;x)V(Ti'- 


1  7  \ a) 


vh  ere¬ 


ct 


X  (  0  ;  X  )  t. 


(  ’SM 


Again,  the  same  Runge-KuCta  techniques  can  he  used  to  solve  (  ~  1 '  , 


given  outputs  £r»>m  similar  Kunge-Kut  ta  appl  Hat  ions  tv  (*.>**1  and  ( n“*i 


■4  ft  - 

for  X(T;X)  and  7  X(T;X).  All  the  inputs  for  solving  (70)  are  then 
L 

available,  which  can  also  fc^  solved  via  the  same  Runge-Kucta  technique. 

A 

V(X;T)  is  then  inverted,  and  —  formed  via  the  multiplications 

shown  in  (27).  We  are,  in  effect,  solving  simultaneously  via  Runge- 

Kutta  techniques  the  equations  (64),  (68),  (70),  and  (71).  This 
2  3  2  2  3 

involves  m+m  +  m  +m  =  m  +  2m  +m  simultaneous  equations;  for 
tn  =  4  this  tc^al  is  100.  Given  this  solution  for  a  given  T,  the  same 

A  A 

Runge-Kutta  technique  then  used  to  update  X(T)  to  X(T  +  AT)  as  per 
(60), 

In  addition  to  the  preceding,  it  is  necessary  to  calculate 
7  f(X,T)  in  order  to  apply  (61).  From  (25), 

aV(X,T)  *  *  *  , 

— «  -  27cx  (T;X)H  R  [Z(T)  -  HX(T;X)],  (72) 

which  can  be  solved  immediately  with  the  preceding  set,  leading  to  a 
total  of  104  simultaneous  equations.  This  explains  the  large  computer 
execution  times  per  As. 

The  actual  implementation  of  the  preceding  is  sketched  in  the 
program  flow  diagram  in  Appendix  B. 

COMPUTATIONAL  RESULTS 

The  objectives  of  the  calculations  performed  are  several.  The 
first  is  simply  to  show  that  the  maximum-likelihood  estimator  derived 
above  can  be  numerically  calculated.  Another  aim  is  to  show  the  rate 
of  its  convergence  and  the  usefulness  of  the  limit  to  which  the  MLE 
tends.  A  third  goal  is  to  illustr-te  somt  of  the  computational 
choices  which  must  be  made  (e.g.,  magnitudes  of  AT  and  As)  and  their 
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eftects.  Finally,  it  is  desired  to  display  relative  to  the  computed 
solutions  some  numerically  derived  values  for  the  Cramer-Rao  bound. 

Computations 

The  nonlinear  and  complicated  nature  of  F(X)  makes  it  necessary 

to  calculate  X(s;Xq)  for  any  given  Xq  numf  cally.  Although  this  can 

(19) 

be  done  quite  easily  with  the  ROCKET  program,  it  is  relatively 
expensive  in  computer  time.  In  part  for  this  reason,  the  computations 
to  be  described  were  made  for  only  one  X^,  that  corresponding  to 
Fig.  2: 


f  77929  ft 
-  21836  ft/sec 
.83200  rad 


•.011849  rad/sec. 


Variations  in  relative  to  u,  the  expected  value  of  X^,  were  accom¬ 
plished  by  changing  u.  Two  such  variations  were  examined,  one  corres¬ 
ponding  roughly  to  X^  -  4  coordinates  of  two,  twenty,  twenty,  and  two 
times  their  standard  deviations  (about  their  4  values,  respectively); 
and  another  corresponding  to  XQ  -  u  coordinates  of  three  times  the 
above  differences.  The  smaller  excursion  was  used  as  the  principal 
case  and  corresponds  tc  an  excursion  which  should  not  usually  be 
exceeded;  the  relative  magnitudes  were  chosen  to  test  for  convergence 
by  coordinate.  The  more  extreme  excursion  was  used  simply  to  test 
initiation  and  convergence  for  an  extreme  deviation  from  the  Initial 
value  of  the  estimator  at  T  *  0:  4. 


% 


| 


-48- 


The  trajectory  corresponding  to  the  above  X^  was  calculated  at 
-3 

10  -second  steps  over  a  5 -second  interval  preceding  (and  near)  RV 

impact.  This  served  as  the  basic  trajectory.  For  each  noise  matrix  R 

-3 

used,  a  random  sequence  of  noise  inputs  was  generated  at  10  -second 
steps,  and  a  sample  observation  waveform  Z(s) ,  C  s  s  s  5  seconds,  cal¬ 
culated  from 


Z(s)  *=  HX(s;XQ)  +  N(s)  . 

Given  the  random  (noise)  component  in  Z(s),  it  is  impossible  to  store 

_3 

Z(s)  over  all  points  of  a  time  continuum  in  closed  form;  but  As  *  10 
second  was  an  adequate  density  for  our  purposes. 

Only  two  noise  matrices  (R)  were  used,  one  representing  a  nominal 
radar  performance,  the  other  an  extremely  accurate  performance  to  test 
the  impact  of  R  upon  estimator  initiation  and  convergence.  The  former 
(nominal)  R  was  used  as  the  principal  case.  Those  used  were,  the 
first  being  the  principal  case. 


R 


42fd 


l(ft/sec)‘ 


o 


3.04X10 ’^rad2 


It  is  useful  to  express  the  last  component  of  R  in  square  feet,  to  make 


it  visibly  comparable  to  the  first  two  components.  It  becomes  at 


R 


With  R  in  this  form  it  is  more  clear  that  the  angular  accuracy  ascribed 
to  the  radar  is  not  out  of  line  with  the  range  accuracy. 

Given  the  preceding  selections,  there  remains  only  A  unspecified. 
Its  value  principally  has  two  effects.  First,  its  magnitude  coupled 
with  that  of  R  Influences  (in  some  complex  way)  the  feasibility  of 
numerically  inverting  7  f(X ,T) --and  therefore  that  the  basic  differ¬ 
ential  equation  (27)  will  be  applicable  and  soluble  for  all  T  of  inter¬ 
est.  This  interaction  between  A  and  R  was  explored  in  a  very  limited 
way  by  varying  R  as  per  above.  Second,  the  magnitudes  of  A  and  R  help 
determine  what  values  of  AT  (the  step  sire  used  in  the  numerical  up¬ 
dating  in  time  of  the  estimator)  can  be  used.  That  this  is  the  case 

A 

can  be  seen  by  examining  the  expression  (28)  for  —  at  T  ■  0: 


-50- 


The  vector  —  gives  the  direction  in  which  X(T)  should  be  changed;  how 
ever,  use  of  too  large  a  value  for  AT  in  (60), 


A 

X(T  +  AT)  =  X(T)  +  V  ~  AT, 


(60) 


will  cause  divergence  from  the  required  solution.  It  is  clear  from 
(28')  that  large  A  tends  to  emphasize  the  influence  of  the  radar  mea- 

A 

surements  on  ~  and  large  R  to  emphasize  the  a  priori  statistics,  as 
should  be  the  case.  Altering  AT  does  not  alter  the  direction  of  change 

A 

dX 

given  by  it  only  alters  the  magnitude  of  the  change  in  that  direc¬ 
tion. 

As  explained  earlier,  the  computing  routine  used  here  employs  as 
many  existing  programs  as  possible  and  is  therefore  expensive  in  com¬ 
puter  time;  therefore,  relatively  large  AT  values  are  preferred--in  the 
-3  -1 

range  10  to  10  seconds.  These  factors  constrained  the  computa¬ 
tions  to  the  use  of  just  one  A  and  the  use  of  various  values  of  AT 
in  order  to  examine  its  influence  upon  initiation  and  convergence. 

The  A  used  was 


A 


/io6ft2 

;  ? 

10  (ft/sec) 


3X10"4rad2 
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10  (raa/sec) 


1 


It  is  of  interest  to  express  all  coordinates  of  A  In  feet  and  display 

if? 

A,  ",  in  order  to  indicate  the  comparability  of  all  the  error  compo- 
i  “ntt?  and  to  make  clearer  the  a  priori  error  magnitudes  involved: 
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Thete  remains  to  be  assumed  the  step  size,  As,  to  be  used  inside 
the  integration  routines  for  fixed  T--e.g.  ,  in  the  calculation  of 
X(T;Xq).  In  these  calculations ,  this  has  been  taken  equal  to  AT  and 
thus  has  fallen  in  the  range  of  10  to  10  seconds  and  was  varied  with 
AT  to  examine  the  effect  upon  estimator  performance.  The  principal 
effect  that  As  size  has  is  to  introduce  a  gr-'nular  structure  into  the 
surface  f(C,T),  the  minimum  of  which  is  being  estimated.  Clearly,  the 
accuracy  of  estimation  possible  will  be  limited  by  this  granularity 
and  therefore  by  As. 

Computational  results  will  be  shown  in  terms  of  tue  absolute 
er~or  between  individual  cr-nponents  of  the  estimator  and  the  actual 
initial  condition  Xq ;  also  shown  will  be  the  same  errors  transformed 
to  the  equivalent  absolute  errors  in  distance  and  in  velocity.  For 
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we  define  the  errors  of  interest  as 


er  =  Ir0  ‘  V 


8r  =  ifo  -  V 


€e  =  !eo  ‘  eo 


€e  =  |0o  '  V 

r  2  2  2-i  1/2 

=  er  +  rCe9J 

r  2  2  2,1/2 

~  I  e.  +  r  ei  , 

V  r  OS 


where  and  are  the  errors  in  distance  and  velocity,  respectively. 


Pure  Differential -Equation  Solution 

In  Fig*’.  4  and  5  are  shown  these  errors  as  a  function  of  time  T 

-1  -2  -3 

for  several  fixed  values  of  As  »  AT  (10  ,  10  ,  and  10  seconds). 

In  these  estimates  Y  1  and  no  Newtonian  improvements  to  the  pure 
differential  equation  solution  were  employed.  TTie  different  ranges 
of  T  over  which  the  errors  are  shown  (for  different  As)  result  be¬ 
cause  of  the  greater  computing  time  required  fo*-  the  smaller  As 
values,  The  intent  is  to  show  enough  T  range  collectively  to  indi¬ 
cate  the  performance  of  the.  estimator. 

Figure  4a  indicates  that  for  As  -  10  *  seconds  r^  is  either  di¬ 
verging  or  tending  to  a  limit  some  distance  from  r^.  Three  things 
are  probably  occurring.  First,  the  granularity  Introduced  into  the 
surface  f($,T)  by  a  finite  sampling  interval  As  ■  10  *  seconds  is 
probably  such  that  great  accuracy  is  not  possible.  This  is  borne 
out  later  by  the  differential -equation/Newtonian  calculation,  which 


T  ~~  fin  »  ( seconds) 


apparently  has  an  asymptote  at  about  1500  feet  which  should  depend 
mainly  ur>on  this  granularity.  Second,  AT  *  10  L  seconds  is  probably 
just  too  large,  with  resulting  divergence  from  the  desired  solution. 
Third  as  will  be  shown  below,  this  estimator  will  sacrifice  perfor¬ 
mance  in  one  coordinate  in  order  to  improve  overall  performance,  and 
this  may  be  occurring.  Similar  remarks  apply  to  Fig.  4b  for  As  =>  10  ^ 

A 

seconds,  which  displays  g^.  In  Fig.  4c,  it  appears  that  8q  is  cor/.-erg- 
ing  usefully.  Figure  4d ,  however,  indicates  that  for  As  «  10  seconds 

A 

0  is  diverging  rapidly.  In  Fig.  5  are  shown  the  errors  and  ^ .  It 
appears  that  is  converging  to  zero;  since  is  a  combination  of 

A 

and  ep ,  it  would  appear  that  the  behavior  of  depicted  for 
As  10  *  seconds  in  Fig.  4a  was  in  part  an  attempt  by  the  estimator 
to  produce  a  better  overall  miss -distance  result.  Similar  remarks 
apply  to  for  As  *  10  1  seconds,  presented  in  Fig,  5b.  However,  it 
is  clear  from  Fig.  5b  that  AT  =  As  =  10  *  seconds  is  too  large  for 
convergence , 

-3 

Figures  4  and  5  indicate  that  As  values  of  10  and  10  seconds 

-2 

provide  convergence.  The  slight  divergence  of  for  As  R  10  sec¬ 
onds  (Fig,  4a)  Is  shown  in  Fig.  5  to  be  the  result  of  an  overall 
improvement  in  and  ,  as  discussed  above.  However,  two  results 
shown  in  Figs.  4  and  5  deserve  special  discussion. 

First,  although  e  ,  e  ,  and  el  all  appear  to  converge  rapidly 

ro  90  eo 

to  limits  near  zero  g.  does  not  (Fig.  4b);  similarly  for  the  rapid 

ro 

convergence  of  toward  7ero ,  but  not  gy  Near  T  *  0  this  is  due  in 
part  to  the  relative  emphasis  placed  upon  the  radar  measurements  by  the 


eleinen*'"  >  f  A  and  R  and  can  b.st  be  visualized  by  examining  at 

dT 


cix 

T  =  0  given  by  (28).  Given  the  diigonal  nature  of  A,  R  and  R,  ~ 
takes  the  form,  substituting  in  the  values  of  fa^]  and  f ? { ]  from 
and  (nominal)  R, 


T=0 


dX 

dT 


T»=G 


lOOCz^O) 

*2(0) 


-  u. 


23(°) 


v 

V 


U3  / 

y 


This  value  of  0  in  che  fourth  coordinate  will  change,  of  course,  as  T 
increases.  It  is  apparent  from  this  expression  that  the  r^  estimate 
will  emphasize  the  radar  measurements  much  more  than  will  the  estimate 
of  r  in  the  vicinity  of  T  =  0.  The  relative  performance  of  the  two 
estimates  shown  in  Figs.  4a  and  4b  near  T  *  0  is  therefore  to  be 
expected,  and  similarly  for  Figs.  4c  and  4d,  inasmuch  as  the  radar 
errors  are  rather  small.  The  degree  of  radar-measurement  emphasis 
away  from  T  =  0  will  then  depend  upon  the  growth  of  the  product  of 
[^^f  (X  ,T)  ]  *  and  V^x  (T;X)  (see  (27)).  It  would  appear  that  such 

A 

growth  is  not  as  favorable  to  convergence  of  as  for  the  other  esti¬ 
mates.  Naturally,  the  poor  convergence  of  Cy  in  Fig.  5b  results 
because  contains  «^.  Is  the  poor  convergence  shown  by  and  €y 
worth  the  effort?  The  answer  lies  in  two  quarters.  First,  If  the 
initial  error  (between  ^  and  t q)  had  been  larger,  then  convergence  to 
the  shown  level  would  have  been  more  impressive;  this  Is  discussed 
later  relative  to  such  large  excursions  (see  Fig.  8).  The  second 
aspect  deals  with  the  other  result  of  Figs.  4  and  5  deserving  greater 


discussion:  estimation  of  shown  in  Fi'*,.  4d. 


In  the  present  application  the  variable  P  cannot  be  measured 

directly  by  the  radai  and  so  nwst  be  estimated  entirely  in  terms  of 

the  other  observables.  Therefore,  6  can  be  considered  an  unknown 

parameter  (which  has  an  a  priori  statistical  distribution) .  It  is 
(* 

seen  that  converges  for  appropriate  .is  values;  e^,  which  includes 
s •  ,  also  ben*  appropriately  with  increasing  T.  This  is  discussed 
in  greater  detail  below,  under  the  differential -equation/Newtonian 
solution . 

Differential  -Equatlon/Newt.on  ian  Solution 

If  As  were  infinitesimal,  then  use  of  the  Newtonian  technique 
to  improve  the  pure  differential -equation  solution  for  finite  AT 
should  re^ulc  in  an  exact  solution  corresponding  to  infinitesimal 
AT  (given  convergence  of  the  Newtonian  iterations  and  within  the 
limits  of  computational  accuracy).  Since  As  is  finite  in  these  cal¬ 
culations,  only  smaller  improvement  relative  to  this  theoretical 
maximum  can  be  expected.  In  the  results  that  follow,  once  converg¬ 
ence  for  the  differential -equation/Newtonian  estimator  was  indicated, 

-3 

values  for  small  As  (10  seconds)  and  large  T  were  calculated  by 

A 

dX 

setting  =  f)  and  using  just  the  Newtonian  iterations.  That  is, 
for  such  computer -expens ive  cases  V  =  0  was  used  in  (60).  Alterna¬ 
tively,  large  AT  was  used  to  get  an  approximate  starting  solution 
for  large  T,  at  which  T  smaller  As  was  then  used  in  the  Newtonian 
iterations  to  estimate  the  solution  for  AT  ■  A9 • 

Results  for  the  joint  routine  are  displayed  in  Figs.  6  and  7. 

-1 

It  is  seen  that  again  tor  As  =*  10  seconds  (as  discussed  earlier) 
approaches  a  limit  considerably  greater  than  zero.  All  the  other 
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components  converge  even  at  As  =  10'1  seconds,  and  €[)  and  ^  converge 

-1  -2  -3 

for  all  values  of  As  employed  (10  ,  10  ,  and  10  seconds).  The 

apparent  slight  divergence  of  and  after  their  initial  convergence 
again  results  from  an  overall  improvement  in  the  estimation,  as  is 
borne  out  by  reference  to  and  e^  in  Fig.  7.  Again,  and  do  not 
converge  toward  zero  as  rapidly  as  the  other  errors. 

Also  shown  in  Figs.  6  and  7  are  the  corresponding  unbiased  Crame 
Rao  lower  bounds  on  the  conditional  RMS  error.  It  is  clear  that  the 
estimator  converges  with  increasing  T  to  a  solution  and  that  the  solu¬ 
tion  is  a  useful  overall  estimate  of  X^  of  the  order  of  ^he  Cramer-Rao 
bound.  In  particular,  the  estimation  of  for  which  there  are  no 
direct  radar  observables,  is  good.  It  is  this  excellent  estimate  of 
9^  (and  of  P^)  which  makes  these  results  useful  even  though  is  not 
handled  as  well.  For  unless  all  coordinates  are  estimated  reasonably 
well,  such  accuracy  in  particular  ones  is  impossible.  But  the  esti¬ 
mator  has  exchanged  accuracy  between  coordinates ,  thereby  reducing 

-  ^ 

e^,  for  which  there  is  no  direct  measurement,  to  less  than  10 
radians/second,  or  less  than  3  percent  of  its  original  magnitude;  eQ 
is  reduced  from  0.347  radians  to  less  than  10  ^  radians,  or  less  than 
0.3  percent  of  its  original  magnitude;  and  goes  from  2000  feet  to 
less  than  200  feet,  or  less  than  10  percent  of  its  original  size.  It 
is  this  overall  estimate  of  Xn ,  and  especially  of  the  coordinate 
f  r  which  no  direct  radar  measurements  are  available,  that  makes  these 


results  of  interest. 
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Extreme  Excursion  of  (X^  -  n) 

The  above  example  used  a  value  of  X^  differing  from  its  expected 
value;  a,  by  about  two,  twenty,  twenty,  and  two  standard  deviations  in 
its  four  coordinates,  respectively.  To  explore  the  initial  conve-- 
gence  of  the  estimator,  a  u  differing  from  in  each  coordinate  by 
three  times  the  above  differences  was  employed.  Convergence  of  the 
dif ferential-equat ion/'Newtoatan  solution  for  this  case  is  shown  in 
Fig.  8,  where  it  is  seen  that  by  time  T  -  0.25  seconds  the  solution 
has  converged  to  that  corresponding  to  the  smaller-deviation  case. 

At  T  =  3  seconds  these  solutions  correspond  to  errors  in  distance  (D) 
and  velocity  (V)  which  are  about  i  percent  and  25  percent  of  the  ini¬ 
tial  error  at  time  T  =■  0.  This  performance  for  large  initial  errors 
is  another  reason  that  relatively  poor  performance  in  one  coordinate 
(r^)  for  the  more  nominal  case  of  Figs.  4,  5,  6,  and  7  may  be  accept¬ 
able,  as  discussed  earlier. 

Excursions  In  A 

As  discussed  above,  AT  must  be  sized  so  as  to  match  A  and  R  if 
initiation  and  convergence  are  to  occur  successfully.  The  preceding 
variation  of  AT  over  two  orders  of  magnitude  indicate.-:  the  range  of 
insensitivity  of  this  variation  'tor  the  particular  example.  Experi¬ 
ence  with  the  routine  illustrates  that  the  R  and  A- to- AT  relationship 
is  a  very  sensitive  one. 

Excursions  In  R  and  Estimated- to-Actua l  Noise 

The  secondary  R-m.itrix  given  on  page  48  was  used  to  explore  its 
effect  upon  initiation  and  convergence.  Neither  the  pure  differential-  | 

|  i 
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Fig. 8a — Miss-distance  error  for  differential -equation/Newtonian 
solution  for  moderate  and  extreme  initial  conditions 


j.8b  —  Velocity  error  for  differential -equation/ Newtonian 
solution  for  moderate  jnd  extreme  initial  conditions 
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equation  nor  the  d I ffereutla! -equation/Newtonian  estimator  would  con- 

-1  -2  -3 

verge  appropriately  for  As  values  of  10  and  10  .  For  As  *  10  the 
pure  differential -equation  estimator  converged  well,  as  shown  In  Fig.  9. 

It  Is  also  of  Interest  to  explore  convergence  of  the  estimator  If 

the  noise  does  not  correspond  exactly  to  the  R -matrix  employed.  This 

_2 

Is  '"hown  for  as  »  10  seconds  <n  Fig.  9,  where  the  terminology  Is  de¬ 
fined  as  follows: 

o  No  nolse--large  R:  primary  R-matrlx  was  used  In  equations, 
but  there  was  no  noise  added  to  radar  observations. 

o  Small  nolse--large  R;  primary  R-matrlx  was  used  In  equations, 
but  noise  added  to  radar  observations  corresponded  to  second¬ 
ary  R-matrlx  , 

o  Large  noise --targe  R:  primary  R-matrlx  was  used  In  equations, 
and  noise  added  to  radar  observation  corresponded  to  primary 
R-matrlx . 

Hie  T  range  rhovn  Is  great  enough  just  to  show  how  rapidly  all  three 
cases  converge  to  the  same  solution.  Independent  of  the  es t Imated -to - 
actual  noise  relationship.  It  Is  clear  that  as  long  as  R  Is  "larger" 
than  or  equivalent  to  the  actual  noise,  convergence  takes  place.  The 
contrary  case--nolse  "larger"  than  R--should  also  converge  for  appro¬ 
priately  small  aT.  However,  its  calculation  requires  considerably 


more  machine  time  and  was  not  none  here, 


V .  CONCLUSIONS 


These  calculations  did  not  compare  MLE  performance  to  other 
estimators.  Since  computational  results  by  other  researchers  (see 
Ref  13)  had  indicated  the  greater  accuracy  of  the  KLE  in  the  scalar 
case,  the  calculations  here  were  designed  to  test  the  feasibility  of 
applying  MLE  techniques  to  complicated  vector  situations.  To  that  end, 
they  examined  convergence,  limit  of  convergence,  effect  of  initial 
error,  noise  magnitude,  accuracy  of  noise  estimation,  and  ability  to 
estimate  unknown  parameters. 

THEORETICAL  RESULTS 

1.  The  likelihood  expression  p(rjZ(s),  0  «  s  ^  T) ,  upon  which  the 
maximum- like  1 ihood  estimate  for  the  initial  condition  is  based,  wa? 
derived  and  then  spec  iali red  to  the  white -noise  case.  The  desired 
maximum- 1  ike  1 Ihood  estimate  is  that  initial  condition  C  which  maxi¬ 
mizes  p<c!z($) ,  0  s  s  <  T) . 

2.  The  max  imam- 1  ike  1 ihood  estimate  was  shown  to  satisfy  (as  a 
function  of  the  upper  observation  limit  T)  a  differential  equation  (27) 
of  relatively  simple  form;  it  was  also  shown  that  the  resulting  com¬ 
putational  algorithm  can  be  usefully  applied. 

3.  The  conditional  Cramer -Rae  bound  for  the  max  taut.-  1  ike  l  ihood 
estimator  was  developed  and  an  .tporoximar  lot.  derived, 

4 .  The  maximum-! ike  I ihood  estimator  for  a  spec  lie  problem  waa 
then  derived  from  the  preceding  theoretical  results  and  applied  tc>  a 
wtt  ite-nolae  case:  similarly  for  the  C'6-ier-Rao  bound. 
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NUMERICAL  RESULTS 

1-  For  appropriate  values  of  As  =■•  AT  the  estimator  converges  to 
a  useful  solution  for  X^,  ever,  given  extreme  initial  errors.  The  con 
aitional  error  appears  to  be  of  rhe  same  order  as  the  Cramer-Rno  lower 
hound , 

2.  In  the  application  presented,  the  variable  9  cannot  be  me a cured 
by  the  radar,  and  so  4  must  be  estimated  entirely  in  terms  of  rhe  other 
observables.  Therefore,  9  c.,.  •  be  considered  as  «r»  unknown  parameter 

(with  known  a  priori  statistics).  The  estimator  approaches  the  true 
value  of  §  ,  as  well  as  those  of  r.„  and  9  s  with  considerable  accuracy, 

w  V  o 

% 

This  is  done  at  the  expense  of  relatively  slower  convergence  of 

3,.  As  a  function  of  T.  the  individual  estimates  of  r  ,  *  ,  9^. 
and  9,,  may  not  converge  mono  topically  but  my  vary  so  as  to  improve 
the  overall  estimate  of  X„ » 

4,  Convergence  the  pure  differentia  1 -equation  solution  esti¬ 
mator  depends  critically  upon  the  value  of  AT  used.  This  is  to  be 
expected,  since  AT  is  the  factor  by  which  -.he  derivative  of  X(T)  is 
weighted  in  forward  projections  in  time.  The  nature  of  this  dependence 
is  influenced  by  A  and  R. 

5,  Convergence  of  the  differential -equation/ Newtonian  estimator 
is  less  dependent  upon  the  value  of  AT;  both  estimators  have  basic  ac¬ 
curacy  limitations  determined  by  the  size  of  As,  which  introduces  a 
granularity  into  tne  surface  f(X,T), 

6,  Initiation  difficulties  presented  by  large  values  for  -  ;i 
can  be  overcome  by  the  use  of  sufficiently  small  values  fot  AT.  A 
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vai'iable  AT,  increasing  with  T,  could  perhaps  be  used  in  future  pro¬ 
grams  . 

7,  For  the  ease  at.  hand .  3  to  4  seconds  of  reel  track  rime  suf- 
ficed  for  essential  convergence,  at  fa  =  10  "*  seconds. 

3,  Computer  time  required  wns  several  orders  of  magnitude  greater 
that;  the  real-time  observation  processed.  With  specially  written  pro¬ 
grams  ami  special-purpose  mamines  this  should  be  reducible  to  on- 
5  ioe  proportions,  Given  the  apparent  accuracy  advantages  of  &LE  over 
certain  approximations  evaluated  heretofore  and  its  computations: 
feasibility  *  i  ,  e ,  ,  solution  at  and  continuing  from  ?  -  0),  h'LE  tech¬ 
niques  promise  a  useful  approach  to  complicated  nonlinear  problems  of 
the  hind,  examined  here. 

Firruas  BsysLoarewTs 

1.  As.  Imyoi-Cant  resafninjt  j>robl«tt  is  that  of  determining  the 
relationship  between  the  ®a5i»*ia- likelihood  estimate  and  the  minimum 
RMS  (error)  cs  client*  . 

2.  The  bi«&  of  tne  max iauro- like  11 hood  estimate  is  not  understood 
end  needs  to  be  further  examined, 

3.  The  error  evaluation,  rands  here  needs  to  be  extended  to  include 
estimates  of  the  RKf.  error. 

4.  More  efficient  programs  need  to  be  devised,  now  that  it  has 
been  demon?)? rated  chat  the  Algorithm  developed  here  is  useful  in  the 
vector  case.  This  will  assist  in  the  evaluation  in  3  Juafc  above. 
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Appendix  A 

DERIVATION  OF  CONDITIONAL  PROBABILITY  p(clz(s),  0  £  s  <  T) 
Defining 

X(s;C)  =  X(s),  given  that  X(G)  -  C,  (Ala) 

and 

X(0)  =  XQ,  (Alb) 

then  for  given  =  C  we  have 

ECZ<8)jcl  =  Ef*H(s)X(s  jC)  +  R(s)  j  =  H(s}X(s;C)  +  EfN(s)] 

=  H(g)X(s ;C) ,  (A 2) 

and  the  proce  ...  covariance  conditioned  upon  C  is 

R(s  ,  u  j  C )  =  Ef(Z(s)  -  EfZ(s) |cl)(Z(u)  -  ETz(u) |cU* jc] 

*  E|TH(s)X<s;C)  +  N(s)  -  H(s)X(s;C)l 
^H(u)X(u;C)  +  N(u)  -  H(u)X(u;C)]  1 
=  EirN(s)N*( u)  1  =  R(s  ,u)  . 

That  is,  the  covariance  function  of  Z(s)  is  independent  of  Xq  -  C 
and  is  the  same  as  the  noise  covariance  function.  However,  the  mean 
varies  with  X^  =  C  as  per  (A2) , 

We  will  assume  for  this  development  that  R(s,u)  satisfies  certain 
conditions  permitting  the  following  manipulations;  these  conditions 


will  be  spelled  out  later. 
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Let 


be  the  orthonormal  eigenvectors  and  the  related  eigenvalues  of 

k  k 

rk(s,u);  that  is,  ^.(s),  X. }  satisfy 


L 


rk(s,u)^(u)du  = 


k. 


(A3) 


(Note:  y^(u)  is  a  scalar  function.) 


Define 


/  2k(£)^(s)ds»  and 

»  A 


(A  4) 


n*(C)  =  EC*J|C\ 


(A5) 


Then 


m*(C)  -  E 


e  T  T 

•<  /  zk(s)  ^(s)ds  |C^  *  j  Efz^Cs)  |cHj(s)ds 

■'J0 


or ,  from  (A2) , 


i. 

m^‘(C)  =  Hk(s)"'s  ;C)  iij(s)de  , 


where 


(A6) 


H(s) 


Vs>) 

\<»>/ 


i.e.,  Kt(s)  *  (hkl(s)  ...  \m(s)). 


km 


Then,  defining 


rJ.(C)  =  E f rz^  -  -  ECz’MOljcl, 

ij  i  f  J  J 


one  gets 


ri j<C) 


nt  (s)nk(.u)  ii^(s)  ^(u)dsdu  |C 


1  I 

LL  E*“nk(s)nk(u)  j C  1  ^ (. s )  ^(u)dsdu; 


r^C>  = 


1  X 

’ If 

“  n  ~ i'i 


rk(s  ,u) 'Ji^(s)  i!>^(u)dsdu  - 


independent  of  C,  so  ».A3)  gives 


i 

=  j  y^u^ds  = 


xWr 


(A10) 


where  6 ^  is  the  Kronecker  delta: 


1 .  i  =  j 

0,  i  j 


(All) 


k  v 

It  is  also  clear  by  analogy  with  (A8)  that  and  z^  are  independent 
for  k  +  v ,  since  n^s)  and  n  (*->)  are  independent  noise  processes.  We 
thus  can  define  the  random  vector  series  fZ(0,  i  =  1,2,  ...  K  where 


Z(l) 


(Al?) 
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and  we  know  from  above  that  Z(i|c)  is  gaussian  with  mean 


M(i;C)  - 


and  covariance  matrix 


R(i) 


(independent  of  C) , 


where  m^(C)  is  given  by  (A6): 

T 

k  /*  k 

mi(c)  =  /  Hk<s)x(s:C)V8)d?- 

*0 

Letting  p[C,Z(l),  ....  Z(£)  1  be  the  probability  density  for 
rc,Z(l> . Z(Z)],  we  can  write 


PrC,Z(l),  ....  Z(/)]  =  p(C)p(Zi.  1)  ,  ....  Z(£)|c) 

=  pfzd),  ....  Au)]prc!z(i),  z (z)i, 


and  so 


prc|z(o,  ....  zu)" 


=  KlP(C)p(Z(l) ,  ,  Z(i) |C). 


(A13) 


(A  14) 


(A6) 


(A  15) 


(Alb) 
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Thufl,  p(C | Z( s)  ,  0  <  s  <  T)  =  p(c|z(l),  Z ( 2)  ,  is  given  h 


1  ins  pr  C|2(  1)  ,  Z  (>’)'! 


v2  exp 


[■»{ 


(C  -  n)*A-1(C  -  p.)  +  E  rz(i) 

i  =  l 


-  M(i,C)]*R"1(i)fZ(i)  -  M(i 


(i,C)lJ] 


Then,  T.  TZ(i)  -  M(  i  ,C)  ]*R*  1  ( i)  TZU)  -  M(i,C)l 
1=1 


p  1  -  k  k  7 

E  E  ~  *  z*  -  mX(C)l 

1=1  k=l  1*  1  1 


1  rr  k  ,2  k  k 


E  E  rfz*r  ••  2z*  m*(C)  +  fm* 
1=1  k=l  X  11  1 


k(C) 


p  on 

E  E 

k=l  1=1 


,  k  l 


k  k r  k.„.  , 2 
z  m  (C)  fm  (C)^ 

2  — i -  +  — i— - 


i 


P  *  1  U  2  P  k  P  ■  n 

E  v  -T  K  >  -  2  E  a  (C)  +  •  f  vK(C)  T 

k=l  1=1  X*  k=l  k=l 


Llk(C) 


k  k,  . 
""  z,  m  (C) 

E  — — r - 


1=1  X 


r>k(c)l2 


fmk(C)  1  2 

V  _ J _ 

1  =  1  xk 


(A  17) 


with 


(A  18a) 


(A  18b) 


-77- 


pfc  | Z(  s)  ,  0  <  s  <-  T"!  = 


K3  exp 


(C  -  d.)*/._1(C  -  u) 


zk(s)g  ( s  ;C) os 


T 

^  j  Hk(8)X(8;C)gk(8;C)dsJ  , 


where  g  (s;C)  satisfies 


g  (s >C)rk(s  ;u)ds  *  Hk(u)X(u;C) 


If  we  define 


ck(s)  =  ~  t  +i<s>. 

k  t=i  1 


there  results 


T 

y*  * 

I  '  (s)r  (s,u)ds  -  F  zk  ik(u) 

i-1 


k 

provided  (again)  that  the  •  •  (u))  are  complete. 


r.  (u> 

K 


That  is 


r  k-,2  T 

>  •  I 


rk(»).'k(8)ds. 


where  £  ( s )  satieties 

V 


/  V 


s)  r,  (s ,u)ds  *  z,  (u)  . 
x  k 


(A24a) 


(A  24b) 


(A25a) 


(A25b) 
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eigenvalues  such  that  the  fX  )  are  positive  or  zero  and  bounded 

k 

above;  there  are  no  other  eigenvectors  orthnorma'.  to  these  f;^(s)1; 
and  the  f'Jj  .  (s)}  are  complete  over  the  functions  of  integral  le  square 
or.  (OjT).  The  latter  means  that  for  any  function  g(s)  of  integrable 
square  it  can  be  vritten  as 


g(s) 


l.i.m.  Z  g  f(s), 
1  =  1 


where 


gi 


l 


(s)us 


and  the  meaning  of  l.i.m.  is  that 


1  im 

S' 


1 

/ 


g(f 


11  i  V5 


ds  -  0. 


Mercer's  Theorem  then  applies,  which  states  that 


r .  (  s  ,  u )  - 

K 


i  ”  i 


‘t  Vs),i(u)  • 


where  the  convergence  is  uniform  in  (s  .til  for  n  *•  s  .  -j  -  T,  Further , 
Picard's  Theorem  then  states  th.  u  the  g\s  ;f  in  (Aif  exists  and  ss 
of  i ntegraMe  square  ever  (0,T)  if  and  on ! v  if 


H,  !.  fX  i  u  ;f 


N  k 

,  —  -  •  K. 

:  'if  .(u.'. 

N  --  i  - 1  i 


where 
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k 

B(C) 

i 


I  ^(it)X(v;  C)v,(u)d«. 


and  the  series 


r 


B(C> 
i 


'-i 

1=1  L A .  j 


Picard’s  Theorem  further  states  that  the  unique  solution  is  thev 


H 


k 

6(C) 


gU{s;C)  =  1.1. m.  Z  ¥*(»), 

H-*®  X 


k  k 

as  in  (A20)  with  0(C)  =  m(C) . 

i  i 

Similar  remarks  (from  Picard's  Theorem)  apply  to  ^(s)  of  (A25)  where 
now  we  insist  that 


-  C^32 

r  - i -  <  00  , 

i-1  fx^l2 


i 


VV  =  /  2k(s)*^(8)ds , 


where 


The  unique  expression  for  rk(^'  is  then 

is)  =  1 ,  i  .rn. 
k’  N- 

k  it 

as  in  (A2!>a)  with  >  =  z .  . 

11 

We  thus  see  that,  necessary  and  sufficient  conditions  fcr  the  pre¬ 
ceding  development  are 

L.  r  (s,u)  positive  definite,  k  -  3.,  ....  P. 


i 


i=l 


rmJ{C) 

-W 


,  p, 


=  1, 


where  X*,  zk ,  and  nik(C)  are  defined  in  (A3),  (A4)  ,  e'.nd  (A6) . 

Equation  (A26)  simplifies  greatly  when  the  noise  N(s)  is  such  as 
to  permit  the  above  development  and  the  r,  (s,u)  can  be  treated  as 
delta -funs t ions : 

rk(s,u)  -  6 (s  -  u)  ,  k  =  1 , 

From  (A26)  we  then  have  £  (s)  and  t^(s;C)  satisfying 


P- 
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and 


JL 

f 


g  (s;C)ik  5(S 


u)ds  H  (u)X(u;C), 


Ss'>  =  r*  zw(s), 


'k 


gK(s;C)  =  ~  H, (s)X(s  ;C> 

K  '  *  / 


so  that  (A2fe)  may  be  rewritten  as 


p(C  jZ(s)  ,  0  s  T)  =  K  J(C) 


where 


J (C)  =  exp 


-  2  |(C  -  u)  *■'"*<€  -  u) 


T 

/  U(s)  -  H(fe)X(s:C)  jV1fZ(s)  -  H(s)X(s;C ]dsl 

*'n  J  J 


/! 


R  = 


o 


O  U  * 

lw  P/ 


-i  r 

K  =  j  J(C)de . 
C 


(A  2  7a) 


(A  2  7b) 


(A27c) 


(A27d) 
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Appendix  B 
PROGRAM  FLOW  DIAGRAM 

In  Fig.  10  is  shown  a  flow  diagram  depicting  the  operation  of  the 
maximum-1 ikelihood -estimator  program.  The  following  describes  what 
takes  pic  e  at  particular  points  in  the  program. 

1.  Star”, 

2.  Input  of  indices,  constants,  observation.  T  is  the  observation- 
interval  upper  limit,  f }  are  the  points  at  which  estimates  are 
made,  X(T^)  is  the  estimate  at  time  T  . 

3.  Question  leading  to  use  of  certain  input  values  if  answer  is 
no;  leading  to  update  if  answer  is  yes. 

4.  Update  of  X(T^)  to  X(T^  +  AT)  via  basic  differential  equation 
(27)  governing  X(T)  and  (60)  . 

5.  Update  of  time  from  to  +  AT. 

6.  Setup  of  initial  conditions  for  integration  to  get  functions 
tabled  over  0  *"  s  T  which  are  needed  in  Newtonian  iteration 
in  13,  as  well  as  needed  in  4. 

7.  Integration  to  table  over  0  •  s  •  T  functions  needed  for  New¬ 
tonian  iteration  in  13. 

8.  Print  X(T.)  ,  DetT7**f  (XfT^  ,T. )  1 ,  ^ffXd^  ,T  )  , 

^X(T  -XfT ,1)  . 

A  1  1 

S.  T-' " f  of  whether  f(X(T  )  ,T )  is  near  enough  to  0, 

A  11 

10.  If  '7*f(X(T.)  ,T^)  is  sufficiently  near  ft  in  4,  i  there 

A 

has  upon  calculated  a  previous  Newtonian  solution  X  at  T 

0  i 
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11.  If  there  was  a  previous  Newtonian  solution  X^.  at  ,  compare 
the  difference  between  it  and  the  current  solution  to  see  if 
there  is  need  to  proceed  further  with  Newtonian  iterations. 

12.  If  the  difference  between  solutions  in  11  is  too  great.,  tem¬ 
porarily  store  the  latest  solution  as  the  best  one. 

13.  Improve  this  best  solution  via  another  Newtonian  iteration. 
Then  go  back  to  6  and  redo  this  series  of  tests  et  al . 

A  A 

14.  If  ^f  (X(T . )  ,T . )  is  near  enough  to  0  or  Xn  is  near  enough  to 

A  1  1  VJ 

X(T^)  ,  ask  if  =  T.  If  I\  /  T,  go  to  4  and  update  X(T^) 
to  X(I^  +  AT)  via  basic  differential  equations  (27)  and  (60) , 
e  tc . 

15.  If  in  14  =  T,  stop. 
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